Iterative.ai has announced the latest releases of its open-source projects Data Version Control (DVC) and Continuous Machine Learning (CML).

According to the company, DVC and CML remove the need for proprietary AI platforms (such as AWS SageMaker and Microsoft Azure Machine Learning) by extending traditional software tools like Git and CI/CD to meet the needs of ML engineers.

According to the company, DVC brings agility, reproducibility, and collaboration into the existing data science workflow. It gives users a Git-like interface for versioning data and models, which brings version control to machine learning and addresses the problem of reproducing results.

DVC is built on top of Git. Instead of storing large files in Git, it creates lightweight metafiles that are committed to the repository in their place, while the files themselves are kept in remote storage, either in the cloud or on on-premises network storage.
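
The metafile idea can be illustrated with a small Python sketch. This is not DVC's implementation or its file format: the function names, the `.pointer.json` suffix, the use of SHA-256, and the local `cache_dir` standing in for remote storage are all assumptions made for illustration. It only shows the general pattern of keeping a small, Git-friendly pointer next to content that lives elsewhere.

```python
import hashlib
import json
import shutil
from pathlib import Path


def track_file(data_path: Path, cache_dir: Path) -> Path:
    """Copy a file into a content-addressed store and write a small pointer file.

    Hypothetical helper: `cache_dir` stands in for remote storage.
    """
    # Hash the file in chunks so large files never have to fit in memory.
    digest = hashlib.sha256()
    with data_path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    checksum = digest.hexdigest()

    # Store the content under its checksum.
    cache_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(data_path, cache_dir / checksum)

    # The pointer file is the only thing that would be committed to Git.
    pointer = {
        "path": data_path.name,
        "sha256": checksum,
        "size": data_path.stat().st_size,
    }
    pointer_path = data_path.with_name(data_path.name + ".pointer.json")
    pointer_path.write_text(json.dumps(pointer, indent=2))
    return pointer_path


def restore_file(pointer_path: Path, cache_dir: Path) -> Path:
    """Recreate the original file from its pointer and the store."""
    pointer = json.loads(pointer_path.read_text())
    target = pointer_path.with_name(pointer["path"])
    shutil.copyfile(cache_dir / pointer["sha256"], target)
    return target
```

Because the pointer records a checksum of the content, checking out an older commit of the pointer file is enough to identify exactly which version of the data belongs with that commit.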

CML is an open-source library for implementing continuous integration and delivery (CI/CD) in machine learning projects.

Users can automate parts of their development workflow, including model training and evaluation, comparing ML experiments across their project history, and monitoring changing datasets. CML can also auto-generate reports with metrics and plots in each Git pull request.
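
As a rough illustration of what such a report might contain, the Python sketch below builds a Markdown table comparing the metrics of two experiments. It does not use CML itself; the function name `build_report` and the example metric values are hypothetical, and a real pipeline would hand the resulting text to its CI tooling to post on the pull request.

```python
def build_report(baseline: dict[str, float], candidate: dict[str, float]) -> str:
    """Return a Markdown table comparing metrics present in both runs."""
    lines = [
        "## Model metrics",
        "",
        "| Metric | Baseline | Candidate | Change |",
        "| --- | --- | --- | --- |",
    ]
    # Only compare metrics that both runs report, in a stable order.
    for name in sorted(baseline.keys() & candidate.keys()):
        old, new = baseline[name], candidate[name]
        lines.append(f"| {name} | {old:.4f} | {new:.4f} | {new - old:+.4f} |")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    print(build_report({"accuracy": 0.90, "loss": 0.30},
                       {"accuracy": 0.92, "loss": 0.25}))
```

Keeping the report as plain Markdown means the same text can be read in a terminal, stored as a build artifact, or rendered in a pull request comment.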

Together, CML and DVC give ML engineers a number of features that support data provenance, machine learning model management, and automation. These include GitFlow for data science, a repository and knowledge library, collaboration, and reporting.

DVC and CML are now available via GitHub and GitLab.
