Tags: #llm-deployment

GPU Cluster Management Platform

4.9k

gpustack/gpustack

An open-source GPU cluster manager that orchestrates high-performance AI inference engines like vLLM and SGLang for efficient model deployment across diverse environments.

gpu-management ai-inference llm-deployment

Details

Technical Tutorial

Docker

2.4k

datawhalechina/handy-ollama

A comprehensive tutorial guiding users to deploy large language models locally on CPU using Ollama, making LLM inference accessible without dedicated GPU resources.

ollama llm-deployment cpu-inference

Details