Azure Machine Learning announces output dataset (Preview)
Published date: August 20, 2020
Datasets in Azure Machine Learning can help read data in the cloud in a secure manner, with capabilities like versioning and lineage for tracking and audit.
Datasets create a reference to the data source location, along with a copy of its metadata, so no extra storage cost is incurred and the integrity of the data source is not at risk. Once a dataset is created, you can load it into a common dataframe format, or mount or download its files on a compute target.
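As a rough illustration of those two consumption paths, the sketch below uses the Azure Machine Learning Python SDK (the `azureml-core` package, with the pandas extras needed for dataframe loading). The helper names `load_tabular` and `download_files` and their parameters are hypothetical and chosen only for this example; the `datastore` argument is assumed to be a datastore already registered in your workspace.

```python
def load_tabular(datastore, relative_path):
    """Reference delimited files on a datastore and load them into pandas."""
    # Imported lazily so this module can be read without the SDK installed.
    from azureml.core import Dataset

    # The dataset is only a reference to the files; no data is copied here.
    dataset = Dataset.Tabular.from_delimited_files(path=(datastore, relative_path))
    return dataset.to_pandas_dataframe()


def download_files(datastore, relative_path, target_dir):
    """Reference arbitrary files on a datastore and download them locally."""
    from azureml.core import Dataset

    dataset = Dataset.File.from_files(path=(datastore, relative_path))
    # Returns the list of local paths for the downloaded files.
    return dataset.download(target_path=target_dir, overwrite=True)
```

Mounting follows the same pattern as downloading, using the file dataset's `mount` method instead of `download`.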
With the new output datasets capability, you can write data back to cloud storage, including Blob, ADLS Gen 1, ADLS Gen 2, and FileShare. You can configure where to output data, how to output it (via mount or upload), and whether to register the output for future reuse and sharing. This enables reproducibility and sharing, prevents data duplication, and improves cost efficiency and productivity.
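A minimal sketch of how those three choices (destination, output mode, registration) might be expressed with the SDK's `OutputFileDatasetConfig` class is shown below. The helper `build_output`, its parameters, and the output name `processed_data` are hypothetical and exist only for this example; `datastore` is assumed to be a datastore registered in your workspace.

```python
def build_output(datastore, path_on_datastore, registered_name=None, use_upload=False):
    """Describe where and how a run should write its output data."""
    # Imported lazily so this module can be read without the SDK installed.
    from azureml.data import OutputFileDatasetConfig

    # Where to output: a (datastore, relative path) destination.
    output = OutputFileDatasetConfig(
        name="processed_data",  # hypothetical output name
        destination=(datastore, path_on_datastore),
    )

    # How to output: upload at the end of the run, or write through a mount.
    if use_upload:
        output = output.as_upload(overwrite=True)
    else:
        output = output.as_mount()

    # Whether to register: optional, makes the output reusable as a dataset.
    if registered_name:
        output = output.register_on_complete(name=registered_name)

    return output
```

The returned object is typically passed to a run as a script argument, so the training or processing script receives the path it should write to.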