Reinforcement Learning Framework

OpenRLHF/OpenRLHF

An easy-to-use, scalable, and high-performance open-source framework for Reinforcement Learning from Human Feedback (RLHF) based on Ray and vLLM.

Core Features

High-performance and scalable RLHF with Ray + vLLM distributed architecture.
Unified agent-based design paradigm for extensible RL pipelines.
Comprehensive RLHF pipeline capabilities including SFT, Reward Model, and RL Training.
Support for advanced RL algorithms like PPO, REINFORCE++, GRPO, and RLOO.
Integrated VLM (Vision-Language Model) RLHF support and asynchronous training.
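Of the algorithms listed above, GRPO is the simplest to illustrate: instead of a learned value critic, it computes advantages by normalizing each sampled response's reward against the other responses in its group. The snippet below is a minimal, framework-independent sketch of that group-relative normalization, not OpenRLHF's actual implementation; the function name and `eps` parameter are illustrative.

```python
from statistics import mean, pstdev

def grpo_advantages(group_rewards, eps=1e-6):
    """Group-relative advantages (GRPO-style sketch).

    Each reward is normalized against the mean and standard deviation
    of its own group of sampled responses, so no value critic is needed.
    """
    mu = mean(group_rewards)
    sigma = pstdev(group_rewards)
    return [(r - mu) / (sigma + eps) for r in group_rewards]

# Four responses sampled for the same prompt, scored by a reward model.
rewards = [1.0, 0.5, 0.0, 1.5]
advantages = grpo_advantages(rewards)
```

Responses scoring above the group mean receive positive advantages and are reinforced; below-mean responses are pushed down, and the advantages for each group sum to zero.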

Detailed Introduction

OpenRLHF is an open-source framework for Reinforcement Learning from Human Feedback (RLHF), offering a production-ready path to training large language and vision-language models. It combines Ray for distributed orchestration with vLLM for fast generation, and its unified agent-based design simplifies building and deploying complex RL pipelines. The framework supports a wide array of state-of-the-art RL algorithms and handles both single-turn and multi-turn agent interactions, addressing the need for efficient, extensible RLHF tooling.
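At the core of the PPO pipeline described above is the clipped surrogate objective, which limits how far the policy can move from the policy that generated the samples. The sketch below shows the standard per-token form of that objective; it is a generic illustration of the algorithm, not OpenRLHF's code, and the function name and `clip_eps` default are assumptions.

```python
def ppo_clip_loss(ratio, advantage, clip_eps=0.2):
    """Clipped PPO surrogate loss for a single token (sketch).

    ratio     -- pi_new(a|s) / pi_old(a|s), the importance ratio
    advantage -- advantage estimate for this action
    Returns the loss to minimize: -min(r*A, clip(r, 1-eps, 1+eps)*A).
    """
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return -min(ratio * advantage, clipped_ratio * advantage)

# When the ratio drifts outside [0.8, 1.2], clipping caps the update.
loss_in_range = ppo_clip_loss(1.1, 1.0)   # unclipped
loss_clipped = ppo_clip_loss(1.5, 1.0)    # clipped at 1.2
```

The same objective is optimized in batch form over all sampled tokens; clipping is what lets PPO reuse rollouts for several gradient steps without destabilizing training.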
