Customers working with Azure Machine Learning models have been leveraging the built-in AzureMLBatchExecution activity in Azure Data Factory pipelines to operationalize ML models in production and score new data against pre-trained models at scale. But as the trends and variables that influence a model's parameters change over time, ideally the pipeline should also support recurring, automated retraining and updating of the model with the latest training data. Now Azure Data Factory allows you to do just that with the newly released AzureMLUpdateResource activity.
With Azure ML, you typically first set up your training and scoring experiments, then publish a separate web service endpoint for each. Next, you can use the AzureMLBatchExecution activity in Data Factory both to score incoming data against the latest model hosted by the scoring web service and to run scheduled retraining with the latest training data. The scoring web service endpoint also exposes an Update Resource method that can be used to replace the model used by the scoring web service. This is where the new AzureMLUpdateResource activity comes into the picture. You can now use this activity to take the model generated by the training activity and hand it to the scoring web service, updating the model used for scoring on a schedule, all automated within your existing data factory pipeline.
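To make this concrete, here is a minimal sketch of what the AzureML Linked Service for an updatable scoring endpoint might look like in ADF JSON. The workspace, service, and endpoint segments of the URLs, the region, and the API key are placeholders you would copy from your own web service dashboard:

```json
{
  "name": "UpdatableScoringEndpoint",
  "properties": {
    "type": "AzureML",
    "typeProperties": {
      "mlEndpoint": "https://ussouthcentral.services.azureml.net/workspaces/<workspace-id>/services/<service-id>/jobs",
      "apiKey": "<api-key>",
      "updateResourceEndpoint": "https://management.azureml.net/workspaces/<workspace-id>/webservices/<service-id>/endpoints/endpoint2"
    }
  }
}
```

Note that the default endpoint of a classic Azure ML web service cannot be updated, so updateResourceEndpoint typically points at a second, non-default endpoint (endpoint2 above) that you add to the scoring web service.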
Setting up the experiment endpoints
Below is an overview of the relationship between training and scoring endpoints in Azure ML. Both originate from an experiment in Azure ML Studio, and both are available as Batch Execution Services. A training web service receives training data and produces trained model(s). A scoring web service receives unlabeled data examples and makes predictions.
To create the retraining and updating scenario, follow these general steps:
- Create your experiment in Azure ML Studio.
- When you are satisfied with your model, use Azure ML Studio to publish web services for both the training experiment and the scoring experiment.
- The scoring web service endpoint is used to make predictions about new data examples. The output of prediction can take various forms, such as a .csv file or rows in an Azure SQL database, depending on the configuration of the experiment.
- The training web service is used to generate new, trained models from new training data. The output of retraining is a .ilearner file in Azure Blob storage; a sketch of the ADF dataset that will later consume this file appears after this list.
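As a preview of the ADF side (covered in the next section), the retrained model is consumed as an Azure Blob dataset pointing at that .ilearner file. A minimal sketch, with placeholder linked service, folder, and file names:

```json
{
  "name": "TrainedModelBlob",
  "properties": {
    "type": "AzureBlob",
    "linkedServiceName": "StorageLinkedService",
    "typeProperties": {
      "folderPath": "azuremlmodels/retraining",
      "fileName": "model.ilearner",
      "format": { "type": "TextFormat" }
    },
    "availability": { "frequency": "Week", "interval": 1 }
  }
}
```

The availability section sets the retraining cadence; weekly is just an example.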
For detailed instructions on creating web service endpoints for retraining, refer to our documentation.
You can view the web service endpoints in the Azure Management Portal. These will be referenced in ADF Linked Services and Activities.
Retraining and updating an Azure ML model using ADF
The operationalized retraining and updating scenario in ADF consists of the following elements:
- One or more storage Linked Services and Datasets for the training data, matching the storage type to the training experiment input. For example, if the experiment is configured with a Web Service input, the training data should come from Azure Blob storage; if the data will be pulled from an Azure SQL table, the experiment should be configured with a Reader module. In your scenario, the training input might be produced by an ADF activity, such as a Hadoop process or a Copy Activity, or it might be generated by some external process.
- One or more storage Linked Services and Azure Blob Datasets to receive the newly trained model .ilearner file(s).
- AzureML Linked Service and AzureMLBatchExecution Activity to call the training web service endpoint. The training Datasets will be the inputs, and the .ilearner Dataset will be the output of this Activity (see the pipeline sketch after this list).
- AzureML Linked Service and AzureMLUpdateResource Activity for the scoring experiment endpoint to be updated. A .ilearner Dataset will be the input to this Activity. If there are multiple models to be updated, there will be one Activity for each; if there are multiple endpoints to be updated, there will be a Linked Service and Activity for each.
- Storage Linked Service and Dataset for the Activity output. The Azure ML Update Resource API call does not generate any output, but today in ADF an output dataset is still required to drive the pipeline schedule.
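Putting the pieces together, the two activities might look like the condensed sketch below inside a pipeline definition. All dataset, linked service, and activity names here are placeholders; trainedModelName must match the name of the Trained Model module in the scoring experiment, and the documentation linked below has complete, working JSON:

```json
{
  "name": "RetrainAndUpdatePipeline",
  "properties": {
    "activities": [
      {
        "name": "RetrainModel",
        "type": "AzureMLBatchExecution",
        "linkedServiceName": "TrainingEndpoint",
        "inputs": [ { "name": "TrainingData" } ],
        "outputs": [ { "name": "TrainedModelBlob" } ],
        "typeProperties": {
          "webServiceInput": "TrainingData",
          "webServiceOutputs": { "output1": "TrainedModelBlob" }
        }
      },
      {
        "name": "UpdateScoringModel",
        "type": "AzureMLUpdateResource",
        "linkedServiceName": "UpdatableScoringEndpoint",
        "inputs": [ { "name": "TrainedModelBlob" } ],
        "outputs": [ { "name": "PlaceholderBlob" } ],
        "typeProperties": {
          "trainedModelName": "Trained Model",
          "trainedModelDatasetName": "TrainedModelBlob"
        }
      }
    ],
    "start": "2016-02-01T00:00:00Z",
    "end": "2016-03-01T00:00:00Z"
  }
}
```

Because the update activity takes the .ilearner Dataset as its input, ADF automatically runs it after the retraining activity completes in each slice.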
For more details on creating Datasets and Linked Services, see:
- Creating Data Factory Datasets
- Data movement activities
- Transformation activities in Azure Data Factory
For step-by-step instructions on setting up the scoring, retraining, and update activities for the model, complete with JSON examples, check out our predictive pipelines documentation.
Summary
In this blog post, I have presented an end-to-end scenario for retraining and updating Azure ML web service models.
The Data Factory and Azure ML teams would like to hear from you about your scenarios for operationalizing Azure ML in ADF. In particular, we would like to gather examples of:
- How training data is produced and how it would be ingested into the ADF pipelines
- How retrained models should be evaluated before being put into production
- Scenarios in which retrained model evaluation is or is not required
- Scenarios in which other Azure ML endpoint management (creation, deletion) would be part of an operational pipeline
Please leave comments or post to the Data Factory forum.