microsoft/torchscale
A PyTorch library providing advanced foundation architectures to efficiently and effectively scale Transformers for large language models and general-purpose AI.
Core Features
Quick Start
pip install torchscale
Detailed Introduction
TorchScale is a PyTorch library from Microsoft for building and scaling the foundation architectures behind large language models (LLMs) and general-purpose AI. It gives researchers and developers a set of components that each target a specific scaling problem: DeepNet for training stability in very deep Transformers, Foundation Transformers (Magneto) for generality across modalities, and RetNet and LongNet for capability and for scaling to longer token sequences. The library's focus is training stability, efficiency, and model generality when building larger models.
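To make the DeepNet stability idea more concrete, the sketch below computes the two depth-dependent constants described in the DeepNet paper for an encoder-only or decoder-only stack of N layers: a residual scale alpha = (2N)^(1/4) and an initialization gain beta = (8N)^(-1/4). This is a standalone illustration in plain Python, not TorchScale's own code; the names `DeepNormConstants` and `deepnorm_constants` are hypothetical and do not come from the library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeepNormConstants:
    """Depth-dependent constants for a DeepNorm-style residual block (illustrative)."""
    alpha: float  # multiplies the residual branch: LayerNorm(alpha * x + sublayer(x))
    beta: float   # gain applied when initializing selected sublayer weights


def deepnorm_constants(num_layers: int) -> DeepNormConstants:
    """Return alpha and beta for an encoder-only or decoder-only stack.

    Hypothetical helper following the formulas in the DeepNet paper:
    alpha = (2N)^(1/4), beta = (8N)^(-1/4).
    """
    if num_layers < 1:
        raise ValueError("num_layers must be a positive integer")
    alpha = (2 * num_layers) ** 0.25
    beta = (8 * num_layers) ** -0.25
    return DeepNormConstants(alpha=alpha, beta=beta)


if __name__ == "__main__":
    # Show how the constants change as the stack gets deeper.
    for depth in (12, 100, 1000):
        c = deepnorm_constants(depth)
        print(f"layers={depth:5d}  alpha={c.alpha:.4f}  beta={c.beta:.4f}")
```

As depth grows, alpha increases and beta shrinks, so each sublayer's update is weighted less relative to the residual stream. In this formulation, that is what keeps updates bounded in very deep models.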