An easy-to-use, scalable, high-performance open-source framework for Reinforcement Learning from Human Feedback (RLHF), built on Ray and vLLM.