LLM Training Framework

AI-Hypercomputer/maxtext

A high-performance, scalable JAX-based open-source library for training large language models on Google Cloud TPUs and GPUs.

Core Features

High-performance and scalable LLM training on TPUs/GPUs.
Supports popular LLM architectures like Gemma, Llama, DeepSeek, Qwen, and Mistral.
Facilitates pre-training and advanced post-training techniques (SFT, GRPO, GSPO).
Achieves high Model FLOPs Utilization (MFU) through JAX and the XLA compiler.
Offers a 'decoupled mode' for local execution without GCP dependencies.
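The MFU claim above rests on JAX's `jit` transformation, which hands whole Python functions to the XLA compiler so it can fuse operations into optimized kernels. A minimal illustrative sketch (not MaxText code; the function and shapes here are invented for demonstration):

```python
import jax
import jax.numpy as jnp

@jax.jit
def train_step(w, x):
    # XLA traces this function once and fuses the matmul,
    # activation, and reduction into compiled kernels.
    return jnp.tanh(x @ w).sum()

w = jnp.ones((128, 128), dtype=jnp.float32)
x = jnp.ones((8, 128), dtype=jnp.float32)

loss = train_step(w, x)  # first call compiles; later calls reuse the kernel
```

MaxText applies the same principle at scale: the whole training step is JIT-compiled, so the compiler, not hand-written kernels, is responsible for hardware utilization.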

Quick Start

pip install maxtext
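After installation, training runs are launched by pointing the trainer at a YAML config and overriding fields on the command line. A hedged sketch of such a launch, based on MaxText's config-override convention; the run name, output bucket, and synthetic-dataset override here are placeholder values:

```shell
# Launch a small training run from a base config.
# run_name and base_output_directory are placeholders you must set yourself;
# dataset_type=synthetic avoids needing a real dataset for a smoke test.
python3 -m MaxText.train MaxText/configs/base.yml \
    run_name=my_first_run \
    base_output_directory=gs://my-bucket/maxtext-output \
    dataset_type=synthetic \
    steps=10
```

Every key in the YAML config can be overridden this way, which is how the same codebase scales from a single-host smoke test to a multi-slice cluster job.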

Detailed Introduction

MaxText is an open-source, high-performance, and highly scalable library written in pure Python and JAX, designed for training large language models (LLMs) on Google Cloud TPUs and GPUs. It provides a robust framework for both pre-training and advanced post-training techniques such as Supervised Fine-Tuning (SFT) and reinforcement-learning methods (GRPO, GSPO). By leveraging JAX and the XLA compiler, MaxText achieves high Model FLOPs Utilization and tokens/second throughput at every scale, from a single host to massive clusters, while keeping the codebase simple. It serves as a foundational platform for ambitious LLM projects in both research and production.
