treeverse/dvc
DVC (Data Version Control) is a command-line tool and VS Code extension for managing data, models, and ML experiments, enabling reproducible machine learning projects.
Core Features
Quick Start
pip install dvc
Detailed Introduction
DVC (Data Version Control) is an open-source MLOps tool that brings Git-like version control to data and machine learning models. Data scientists and ML engineers use it to build reproducible projects: they track data artifacts, define lightweight pipelines, and manage experiments directly within their Git repositories. DVC works with cloud storage, so users can version large datasets and models, compare experiment runs, and share their work with collaborators.
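As a rough illustration of the tracking workflow described above, the following shell sketch initializes DVC in a fresh Git repository, tracks one small data file, and pushes it to a remote. The file name `data/train.csv`, the remote name `storage`, and the use of a local directory in place of cloud storage are all assumptions made for this example; a real project would point the remote at its own storage location.

```shell
#!/bin/sh
# Sketch: version a data file with DVC inside a Git repository.
# All paths and names below are hypothetical.
set -eu

# Skip the demo if the required tools are not installed.
for tool in git dvc; do
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "$tool not found; skipping demo (install DVC with: pip install dvc)" >&2
    exit 0
  fi
done

workdir=$(mktemp -d)
remote_dir=$(mktemp -d)   # local directory standing in for cloud storage
cd "$workdir"

git init -q
dvc init -q               # creates the .dvc/ directory and stages it in Git

mkdir data
printf 'id,label\n1,cat\n2,dog\n' > data/train.csv

# Track the file with DVC: writes data/train.csv.dvc and data/.gitignore
dvc add data/train.csv

# Register a default remote and upload the tracked data to it
dvc remote add -d storage "$remote_dir"
dvc push

# Git records only the small metadata files, not the data itself
git add data/train.csv.dvc data/.gitignore .dvc/config
git -c user.name=demo -c user.email=demo@example.com \
  commit -q -m "Track training data with DVC"
```

After this runs, Git holds the `.dvc` pointer file while the data itself lives in the remote, so a collaborator who clones the repository could fetch the data with `dvc pull`.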