microsoft/torchscale
Deep Learning Architecture Library
3.1k 2026-05-06

A PyTorch library providing advanced foundation architectures to efficiently and effectively scale Transformers for large language models and general-purpose AI.

Core Features

DeepNet for training stability, scaling Transformers to 1,000 layers and beyond.
Foundation Transformers (Magneto) for general-purpose modeling across tasks and modalities (language, vision, speech, and multimodal).
RetNet, a retentive-network successor to the Transformer, and LongNet, which uses dilated attention to scale sequence length toward 1 billion tokens.
X-MoE for efficient and scalable sparse Mixture-of-Experts models.
DeepNorm, the residual-scaling normalization introduced with DeepNet, to stabilize training of Post-LayerNorm Transformers.
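
To make the DeepNorm item concrete, here is a minimal pure-Python sketch of the idea as described in the DeepNet paper: the residual input is up-scaled by a constant alpha before Post-LayerNorm, and some sublayer weights are initialized with a gain beta. The constants below are the ones the paper gives for encoder-only or decoder-only stacks; the function names are hypothetical and this is not TorchScale's own code.

```python
import math
from typing import List, Tuple


def deepnorm_constants(num_layers: int) -> Tuple[float, float]:
    """Return (alpha, beta) for an encoder-only or decoder-only stack.

    Assumption: alpha = (2N)^(1/4) and beta = (8N)^(-1/4), the values the
    DeepNet paper gives for a single stack of N layers.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    alpha = (2 * num_layers) ** 0.25
    beta = (8 * num_layers) ** -0.25
    return alpha, beta


def layer_norm(x: List[float], eps: float = 1e-5) -> List[float]:
    """Plain LayerNorm over one vector, without learned scale or bias."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def deepnorm_residual(x: List[float], sublayer_out: List[float], alpha: float) -> List[float]:
    """Post-LN residual with DeepNorm: LN(alpha * x + sublayer(x))."""
    return layer_norm([alpha * a + b for a, b in zip(x, sublayer_out)])
```

With alpha fixed at 1 this reduces to the usual Post-LayerNorm residual; a larger alpha for deeper stacks gives the skip path more weight relative to each sublayer's update.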

Quick Start

pip install torchscale
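
After installing, a model can be built from a config object. The sketch below assumes the import paths and class names shown in the project's README (`EncoderConfig` and `Encoder` under `torchscale.architecture`) and that release's `vocab_size` argument; `build_encoder` is a hypothetical helper name used only for this example.

```python
def build_encoder(vocab_size: int = 64000):
    """Build a TorchScale encoder with default settings.

    The imports are done lazily so this module can be loaded even when
    torchscale is not installed; calling the function requires it.
    """
    # Assumed import paths, taken from the project's README.
    from torchscale.architecture.config import EncoderConfig
    from torchscale.architecture.encoder import Encoder

    config = EncoderConfig(vocab_size=vocab_size)
    return Encoder(config)
```

Calling `build_encoder()` and printing the result should show the module tree of the encoder. Check the repository's README for the config flags that enable individual architectures, since those may change between releases.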

Detailed Introduction

TorchScale is a PyTorch library from Microsoft for research on the foundation architectures behind large language models (LLMs) and general-purpose AI. It gives researchers and developers a set of components: DeepNet for training stability, Foundation Transformers (Magneto) for generality across tasks and modalities, and RetNet and LongNet for modeling capability and longer sequences. The library's focus is on stability, efficiency, and generality when scaling models up.
