Tags: #nlp
huggingface/transformers
A comprehensive library providing state-of-the-art pre-trained models for various machine learning tasks across text, vision, audio, and multimodal domains, facilitating both inference and training.
neuml/txtai
An all-in-one AI framework for semantic search, LLM orchestration, and language model workflows, powered by an embeddings database.
explosion/spaCy
An industrial-strength Python library for advanced Natural Language Processing, offering state-of-the-art models and a production-ready training system.
microsoft/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system designed to extract structured data from unstructured text using LLMs to enhance reasoning on private data.
argilla-io/argilla
Argilla is an open-source collaboration tool for AI engineers and domain experts to build and manage high-quality datasets for various AI models, leveraging human feedback and programmatic workflows.
NirDiamant/RAG_Techniques
A repository showcasing various advanced Retrieval-Augmented Generation (RAG) techniques through detailed notebook tutorials.
google/langextract
A Python library leveraging LLMs to extract structured information from unstructured text with precise source grounding and interactive visualization.
huggingface/datasets
A lightweight library providing one-line dataloaders and efficient pre-processing tools for a vast hub of AI datasets, supporting various ML frameworks.
athina-ai/rag-cookbooks
A comprehensive repository offering practical implementations and evaluation guidance for advanced and agentic Retrieval-Augmented Generation (RAG) techniques.
eosphoros-ai/DB-GPT-Hub
A specialized hub providing models, datasets, and fine-tuning techniques to enhance Large Language Models' performance in Text-to-SQL, Text-to-NLU, and Text-to-GQL tasks.
X-LANCE/SLAM-LLM
A deep learning toolkit for training custom multimodal large language models focused on speech, language, audio, and music processing.
zyds/transformers-code
A comprehensive code repository accompanying a hands-on course for mastering Hugging Face Transformers, covering fundamental concepts to advanced fine-tuning and deployment techniques.
mymusise/ChatGLM-Tuning
A cost-effective solution for fine-tuning ChatGLM-6B with LoRA, enabling personalized large language models.
adapter-hub/adapters
A unified library extending Hugging Face Transformers for parameter-efficient and modular transfer learning in NLP.
promptslab/Promptify
A Python library for structured NLP tasks using LLMs, offering Pydantic outputs, multi-provider support, and built-in evaluation.
embeddings-benchmark/mteb
MTEB is a comprehensive benchmark and evaluation framework designed to assess the performance of text embedding models and retrieval systems across a wide range of tasks.
dbiir/UER-py
An open-source PyTorch-based framework for NLP pre-training and fine-tuning, offering modularity, reproducibility, and a comprehensive model zoo for various downstream tasks.
liucongg/ChatGLM-Finetuning
A toolkit for fine-tuning ChatGLM series models (ChatGLM-6B, ChatGLM2-6B, ChatGLM3-6B) using methods such as Freeze, LoRA, P-tuning, and full-parameter training for downstream NLP tasks.
Yutong-Zhou-cv/Awesome-Text-to-Image
A comprehensive curated list of resources, papers, datasets, and projects related to text-to-image generation and manipulation.
microsoft/unilm
A comprehensive research hub for large-scale self-supervised pre-training of foundation models across diverse tasks, languages, and modalities.
X-PLUG/mPLUG-Owl
A family of powerful multimodal large language models (MLLMs) designed to advance AI's understanding and generation capabilities across various data types.
OpenGVLab/InternVL
A pioneering open-source multimodal large language model family aiming to match or exceed commercial models like GPT-4o/GPT-5 in performance.
FireRedTeam/FireRed-OpenStoryline
FireRed-OpenStoryline is an AI video editing agent that transforms manual editing into intention-driven directing through natural language interaction and LLM-powered planning.
IDEA-CCNL/Fengshenbang-LM
Fengshenbang-LM is an open-source large model system by IDEA Research Institute, serving as infrastructure for Chinese AIGC and cognitive intelligence.