xorbitsai/inference
A unified, production-ready inference API for deploying and serving open-source language, speech, and multimodal AI models on various infrastructures.
Core Features
Detailed Introduction
Xorbits Inference (Xinference) is a powerful open-source library designed to simplify the deployment and serving of a wide range of AI models, including large language models, speech recognition, and multimodal models. It provides a unified, production-ready API that allows developers, researchers, and data scientists to effortlessly run models on various infrastructures—from local laptops to cloud environments. By abstracting away the complexities of model serving, Xinference enables users to quickly leverage cutting-edge AI capabilities, offering features like automatic batching, distributed inference, and integration with agent platforms for advanced AI applications.