Tags: #nlp
huggingface/transformers
A unified framework providing state-of-the-art machine learning models for text, vision, audio, and multimodal tasks, optimized for both inference and training.
web-infra-dev/midscene
An AI-powered, vision-driven UI automation framework for every platform, enabling natural language control and scripting.
neuml/txtai
An all-in-one AI framework for semantic search, LLM orchestration, and language model workflows, built around an embeddings database.
explosion/spaCy
An industrial-strength Python library for advanced Natural Language Processing, designed for building real-world applications.
llmware-ai/llmware
A unified framework for building local, private, and secure enterprise RAG pipelines using small, specialized LLMs optimized for on-device and edge deployment.
FlagOpen/FlagEmbedding
A comprehensive toolkit providing state-of-the-art embedding and reranker models for efficient information retrieval and Retrieval-Augmented Generation (RAG) applications.
microsoft/graphrag
GraphRAG is a modular, graph-based Retrieval-Augmented Generation (RAG) system that leverages LLMs to extract structured data from unstructured text, enhancing reasoning on private datasets.
luhengshiwo/LLMForEverybody
A comprehensive learning platform offering structured knowledge, interview questions, and paper analysis for Large Language Models (LLMs).
argilla-io/argilla
Argilla is a collaboration tool for AI engineers and domain experts to build and maintain high-quality datasets for various AI models, from NLP to LLMs and multimodal systems.
NirDiamant/RAG_Techniques
A comprehensive repository showcasing advanced Retrieval-Augmented Generation (RAG) techniques through detailed, practical notebook tutorials.
google/langextract
A Python library leveraging LLMs to extract structured information from unstructured text with precise source grounding and interactive visualization.
huggingface/datasets
A lightweight library providing a vast hub of ready-to-use datasets and efficient tools for data manipulation in AI and machine learning workflows.
athina-ai/rag-cookbooks
A comprehensive repository offering practical implementations and evaluation guidance for advanced and agentic Retrieval-Augmented Generation (RAG) techniques.
eosphoros-ai/DB-GPT-Hub
A specialized hub providing models, datasets, and fine-tuning techniques to enhance Large Language Models' performance in Text-to-SQL, Text-to-NLU, and Text-to-GQL tasks.
X-LANCE/SLAM-LLM
A deep learning toolkit for training custom multimodal large language models focused on speech, language, audio, and music processing.
zyds/transformers-code
A comprehensive code repository accompanying a hands-on course for mastering Hugging Face Transformers, covering fundamental concepts through advanced fine-tuning and deployment techniques.
mymusise/ChatGLM-Tuning
A cost-effective solution for fine-tuning ChatGLM-6B using LoRA, enabling personalized large language models.
adapter-hub/adapters
A unified library for parameter-efficient and modular transfer learning, extending Hugging Face Transformers with various adapter methods.
argilla-io/distilabel
Distilabel is a framework for engineers to build fast, reliable, and scalable pipelines for synthetic data generation and AI feedback, based on verified research.
huggingface/alignment-handbook
Robust recipes and training code for aligning language models with human and AI preferences, enhancing helpfulness and safety.
AI4Finance-Foundation/FinGPT
FinGPT is an open-source initiative providing cost-effective and rapidly adaptable large language models specifically designed for the dynamic financial sector.
promptslab/Promptify
A Python library for task-based NLP using LLMs, providing structured outputs, universal LLM backend support, and built-in evaluation.
snakers4/silero-models
Silero Models offers a collection of pre-trained, end-to-end text-to-speech models designed for simplicity, speed, and natural-sounding speech generation.
embeddings-benchmark/mteb
MTEB is a comprehensive benchmark and evaluation framework designed to assess the performance of text embedding models and retrieval systems across a wide range of tasks.
dbiir/UER-py
UER-py is an open-source PyTorch-based framework for pre-training and fine-tuning NLP models, offering modularity, extensibility, and a comprehensive model zoo.
liucongg/ChatGLM-Finetuning
A comprehensive toolkit for fine-tuning ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B models using methods such as Freeze, LoRA, P-tuning, and full-parameter fine-tuning.
Yutong-Zhou-cv/Awesome-Text-to-Image
A comprehensive curated list of resources, papers, datasets, and projects focused on text-to-image generation and manipulation.
microsoft/unilm
A comprehensive research initiative by Microsoft focusing on large-scale self-supervised pre-training to develop advanced foundation models across diverse tasks, languages, and modalities.
X-PLUG/mPLUG-Owl
A family of powerful multi-modal large language models (MLLMs) designed to advance AI's understanding and generation capabilities across various data types.
OpenGVLab/InternVL
A pioneering open-source multimodal AI model family designed to serve as a high-performance alternative to commercial models like GPT-4o and GPT-5.