Tags: #reward-modeling
Machine Learning Research Toolkit
python
1.5k
RLHFlow/RLHF-Reward-Modeling
Recipes and code for training reward models used in Reinforcement Learning from Human Feedback (RLHF) for large language models.