The Microsoft DataPrep SDK loads, transforms, and writes data for machine learning workflows. You can use the SDK in any .NET Core environment.
This project provides the following features:
Automatic detection of delimited file types. The SDK detects which of the supported delimited file types your data uses, so you don’t need format-specific file readers or explicit delimiter, header, or encoding parameters.
Summary statistics can be generated quickly for a dataflow with a single line of code.
Complex data preparation tasks can be accomplished through a series of dataflow steps.
Prerequisites:
- .NET Core 2.1
- Access to nuget.org
- Visual Studio 2017
Installation:
- git clone https://github.com/Azure-Samples/DataPrep.Net
- cd DataPrep.Net\Samples
Quickstart:
- Open DataPrepSample.sln from the Samples folder in Visual Studio.
- Rebuild the DataPrepSample project.
- Run the DataPrepSample project.
- Open result.csv under the project folder.
- Verify that the data has been cleaned up.
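If you prefer a terminal to Visual Studio, the clone, build, and run steps above can probably be approximated with the dotnet CLI. The following is a minimal sketch, not a documented workflow for this sample: it assumes the solution builds with `dotnet build`, and that the runnable project sits in a `DataPrepSample` folder next to the .sln file. The variable names and the `run_dataprep_sample` function are hypothetical helpers introduced only for this illustration.

```shell
#!/usr/bin/env bash
# Sketch: command-line approximation of the Visual Studio steps above.
# Assumptions (not stated in this README):
#   - the solution builds with the dotnet CLI
#   - the runnable project lives in a DataPrepSample/ folder beside the .sln

REPO_URL="https://github.com/Azure-Samples/DataPrep.Net"
SAMPLES_DIR="DataPrep.Net/Samples"

# Hypothetical helper: clone the repo, then build and run the sample.
run_dataprep_sample() {
    git clone "$REPO_URL" || return 1
    cd "$SAMPLES_DIR" || return 1
    dotnet build DataPrepSample.sln || return 1
    dotnet run --project DataPrepSample || return 1   # assumed project folder
}

# Uncomment to execute (needs network access, git, and the .NET Core SDK):
# run_dataprep_sample
```

The call is left commented out so that the script can be read or sourced without side effects. After a successful run, open result.csv as described in the steps above and check the cleaned data.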
Additional resources and related projects:
- Reference documentation (https://aka.ms/dataprep.net-ref-doc)
- ML.NET (https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet)