Tags: #distributed-training
Deep Learning Framework
Python
31.0k
Lightning-AI/pytorch-lightning
A deep learning framework that simplifies PyTorch development by automating boilerplate engineering code, enabling scalable training from CPU to multi-node GPUs with minimal code changes.
Deep Learning Framework
Python
41.4k
hpcaitech/ColossalAI
Colossal-AI makes training and deploying large AI models cheaper, faster, and more accessible through advanced distributed training techniques.
Reinforcement Learning Library for LLMs
Ray
3.1k
alibaba/ROLL
An efficient and user-friendly library for scaling Reinforcement Learning with Large Language Models on large-scale GPU resources.