Ecosystem & Stack: cuda

llm serving kvcache disaggregated architecture

5.1k

kvcache-ai/Mooncake

A KVCache-centric disaggregated architecture for high-performance LLM serving, powering leading AI services.

Deep Learning Adaptation Framework

segmentation computer-vision pytorch

1.5k

tianrun-chen/SAM-Adapter-PyTorch

A PyTorch-based framework to adapt Meta AI's Segment Anything Model (SAM) for improved performance on challenging downstream computer vision tasks using adapters and prompts.

multimodal-llm deep-learning speech-processing

Deep Learning Toolkit / Multimodal LLM Framework

linux

1.0k

X-LANCE/SLAM-LLM

A deep learning toolkit for training custom multimodal large language models focused on speech, language, audio, and music processing.

AI/ML Finetuning UI

2.1k

lxe/simple-llm-finetuner

A beginner-friendly UI for fine-tuning language models using LoRA on commodity NVIDIA GPUs, though the project is no longer actively maintained.

llm-finetuning lora peft

AI/ML Model Finetuning Framework

3.8k

mymusise/ChatGLM-Tuning

A cost-effective solution for fine-tuning ChatGLM-6B using LoRA, enabling personalized large language models.

chatglm lora finetuning

Replaces:

ChatGPT

llm-inference lora gpu-optimization

LLM Inference Server

Docker

3.8k

A multi-LoRA inference server designed to efficiently serve thousands of fine-tuned Large Language Models on a single GPU, drastically cutting serving costs while maintaining high throughput and low latency.

diffusion-models generative-ai pytorch

Machine Learning Library

pytorch

33.4k

huggingface/diffusers

A modular PyTorch library for state-of-the-art diffusion models, enabling easy generation of images, audio, and more.

comfyui video-generation ai-model

ComfyUI Custom Nodes / AI Video Generation Plugin

comfyui

3.5k

Lightricks/ComfyUI-LTXVideo

Extends ComfyUI with advanced custom nodes for the LTX-2 video generation model, enabling powerful text-to-video and image-to-video workflows.

CLI Tool

3.2k

SamurAIGPT/AI-Youtube-Shorts-Generator

Automates YouTube Shorts generation from long videos using AI for highlights, subtitles, and vertical cropping.

python ai video-editing

Content Creation Tool

text-to-speech audiobook epub

4.3k

denizsafak/abogen

Generate high-quality audiobooks and voiceovers from various text formats with synchronized captions.

AI Voice Synthesis Web Application

voice cloning text-to-speech few-shot learning

56.8k

RVC-Boss/GPT-SoVITS

A powerful open-source web UI for few-shot voice conversion and text-to-speech, enabling high-quality voice cloning with minimal audio data.

Replaces:

Commercial Voice Cloning Services, Commercial TTS APIs

Multimodal AI Inference and Serving Framework

multimodal-ai model-serving inference-framework

4.4k

vllm-project/vllm-omni

vLLM-Omni is an efficient, flexible, and easy-to-use framework extending vLLM to serve omni-modality models (text, image, video, audio) with high throughput and an OpenAI-compatible API.

LLM Fine-tuning Framework

2.8k

liucongg/ChatGLM-Finetuning

A comprehensive toolkit for fine-tuning ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B models using methods such as Freeze, LoRA, P-tuning, and full-parameter fine-tuning.

chatglm finetuning llm

ai voice tts voice cloning

AI Multimedia Processing Web Application

CUDA

6.6k

abus-aikorea/voice-pro

An AI-powered web application for multimedia content creation, offering speech recognition, voice cloning, multilingual TTS, and YouTube video processing.

Replaces:

ElevenLabs

Local Web Interface for Text-to-Speech

chattts text-to-speech web-ui

7.5k

jianchang512/ChatTTS-ui

Provides a local web interface and API for the ChatTTS model, enabling text-to-speech synthesis with support for mixed languages and numbers.

Replaces:

Cloud Text-to-Speech Services

AI Voice Cloning and Synthesis Tool

voice cloning text-to-speech speech-to-speech

9.0k

jianchang512/clone-voice

A user-friendly web-based tool for voice cloning, text-to-speech, and speech-to-speech conversion, leveraging the Coqui XTTS_v2 model with multi-language support.

Audio Synthesis Framework

singing-voice-synthesis text-to-speech diffusion-models

4.8k

MoonInTheRiver/DiffSinger

DiffSinger is an official PyTorch implementation of a singing voice synthesis (SVS) and text-to-speech (TTS) system, leveraging a shallow diffusion mechanism for high-quality audio generation.

AI/ML Model Implementation

tts voice cloning multilingual

8.0k

Plachtaa/VALL-E-X

An open-source implementation of Microsoft's VALL-E X, enabling zero-shot multilingual text-to-speech synthesis and voice cloning with emotion control.

Replaces:

ElevenLabs, Google Cloud Text-to-Speech...

AI/ML Library

.net

3.6k

SciSharp/LLamaSharp

A cross-platform C#/.NET library for efficient local inference of large language models (LLMs) such as LLaMA and LLaVA.

llm llama csharp

ai-art-generation stable-diffusion web-ui

AI Generative WebUI

stable diffusion

7.1k

vladmandic/sdnext

An all-in-one open-source WebUI for AI generative image and video creation, captioning, and processing, built on Stable Diffusion.

generative ai stable diffusion ai tutorials

Generative AI Educational Resource Hub

google colab

2.7k

FurkanGozukara/Stable-Diffusion

A comprehensive repository offering expert-level tutorials, guides, and courses on various Generative AI technologies, primarily focusing on Stable Diffusion and its ecosystem.

AI Video Generation Tool

stable-diffusion ai-video-generation text-to-video

4.7k

nateraw/stable-diffusion-videos

Create dynamic videos by smoothly transitioning between text prompts using Stable Diffusion's latent space exploration.

AI-powered 3D Generation Node Suite

ComfyUI

3.7k

MrForExample/ComfyUI-3D-Pack

An extensive node suite that integrates cutting-edge 3D generation algorithms and models into ComfyUI, enabling seamless processing of 3D inputs like meshes and UV textures.

comfyui 3d-generation ai