Tags: #alignment
AI/ML Research Framework
Hugging Face · 1.6k stars
PKU-Alignment/safe-rlhf
A modular open-source framework for training constrained, value-aligned Large Language Models (LLMs) with Safe Reinforcement Learning from Human Feedback (Safe RLHF).
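
The "constrained" part is the central idea: Safe RLHF trains a reward model (helpfulness) and a separate cost model (harmfulness) from human preference data, then optimizes the policy to maximize reward subject to a cost constraint, relaxed via a learned Lagrange multiplier. Below is a minimal sketch of that trade-off in PyTorch; the names (`lagrangian_advantage`, `update_lambda`, `cost_budget`) and the `(reward - λ·cost)/(1 + λ)` normalization are illustrative assumptions, not the safe-rlhf API.

```python
import torch

# Sketch of the Lagrangian relaxation behind Safe RLHF (illustrative only).
# Constrained objective: maximize E[reward]  subject to  E[cost] <= cost_budget.
# Relaxation: min over lambda >= 0 of max over policy of E[reward - lambda * cost].

# Keep the multiplier in log space so lambda = exp(log_lambda) stays positive.
log_lambda = torch.tensor(0.0, requires_grad=True)
lambda_opt = torch.optim.SGD([log_lambda], lr=1e-2)

cost_budget = 0.0  # hypothetical constraint threshold on the expected cost


def lagrangian_advantage(reward: torch.Tensor, cost: torch.Tensor) -> torch.Tensor:
    """Fold reward- and cost-model scores into one training signal for the policy."""
    lam = log_lambda.exp().detach()  # treat lambda as a constant for the policy step
    return (reward - lam * cost) / (1.0 + lam)


def update_lambda(mean_cost: torch.Tensor) -> None:
    """Dual step: grow lambda while the cost constraint is violated, shrink it otherwise."""
    lambda_opt.zero_grad()
    violation = (mean_cost - cost_budget).detach()
    # Minimizing -(lambda * violation) raises lambda when violation > 0.
    loss = -(log_lambda.exp() * violation)
    loss.backward()
    lambda_opt.step()


# Toy usage: a batch where the cost model flags some responses as harmful (cost > 0).
reward = torch.tensor([1.2, 0.8, 1.5])
cost = torch.tensor([0.3, -0.1, 0.6])
adv = lagrangian_advantage(reward, cost)  # signal a PPO-style policy update would use
update_lambda(cost.mean())                # mean cost exceeds the budget, so lambda grows
print(adv, log_lambda.exp().item())
```

The framework's training pipeline applies this reward/cost trade-off at scale within PPO; the toy above only shows why alternating the policy step with the multiplier update keeps the safety constraint active.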