Tags: #synthetic-data
AI/ML Data Curation Library
Python
1.7k
bespokelabsai/curator
A Python library for generating and curating high-quality synthetic data for AI model training and structured data extraction.
Data Privacy & Development Platform
Docker
4.2k
nucleuscloud/neosync
An open-source platform for developers to anonymize PII, generate synthetic data, and sync environments, enabling secure testing and compliance.
AI Data Framework
3.2k
argilla-io/distilabel
Distilabel is a framework for engineers to build fast, reliable, and scalable pipelines for synthetic data generation and AI feedback, based on verified research.
AI/ML Synthetic Data Generation Framework
Python
1.6k
NVIDIA-NeMo/DataDesigner
A flexible framework by NVIDIA NeMo for generating high-quality synthetic datasets with diverse distributions, meaningful correlations, and robust validation.