Tags: #llm-deployment
GPU Cluster Management Platform
Docker
4.9k
gpustack/gpustack
An open-source GPU cluster manager that orchestrates high-performance AI inference engines like vLLM and SGLang for efficient model deployment across diverse environments.
Technical Tutorial
Docker
2.4k
datawhalechina/handy-ollama
A comprehensive tutorial that guides users through deploying large language models locally on a CPU with Ollama, making LLM inference accessible without a dedicated GPU.