Tags: #synthetic-data
Kiln-AI/Kiln
A free, all-in-one platform for building, evaluating, and optimizing AI systems, offering tools for RAG, agents, fine-tuning, and synthetic data generation.
bespokelabsai/curator
A Python library for generating and curating high-quality synthetic data for AI model training and structured data extraction.
nucleuscloud/neosync
An open-source platform that lets developers anonymize sensitive production data, generate synthetic data, and sync environments for secure testing and a better developer experience.
argilla-io/distilabel
A framework for generating synthetic data and AI feedback, enabling engineers to build fast, reliable, and scalable AI pipelines based on verified research.
NVIDIA-NeMo/DataDesigner
A flexible framework by NVIDIA NeMo for generating high-quality synthetic datasets with diverse distributions, meaningful correlations, and robust validation.