Machine Learning model training with AKS
Training models on large datasets is a complex and resource-intensive task. Use familiar tools such as TensorFlow and Kubeflow to simplify the training of machine learning models. Your ML models run in AKS clusters backed by GPU-enabled VMs.
- 1 Package the ML model into a container and publish it to Azure Container Registry (ACR).
- 2 Azure Blob Storage hosts the training datasets and the trained model.
- 3 Use Kubeflow to deploy the training job to AKS. A distributed training job includes parameter servers and worker nodes.
- 4 Serve the production model using Kubeflow, promoting a consistent environment across test, control, and production.
- 5 AKS supports GPU-enabled VMs.
- 6 Developers can build features that query the model running in the AKS cluster.
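
The training job in step 3 could be described roughly as follows. This is a minimal sketch, assuming the Kubeflow training operator and its `TFJob` resource (`kubeflow.org/v1`) are installed in the cluster. The helper `build_tfjob_manifest`, the job name, the image `myregistry.azurecr.io/ml-train:v1`, and the replica counts are hypothetical placeholders, not values from this architecture.

```python
import json


def build_tfjob_manifest(name, image, workers=2, parameter_servers=1, gpus_per_worker=1):
    """Return a Kubeflow TFJob manifest as a plain dict (hypothetical helper)."""

    def replica(count, gpus):
        # The training operator expects the training container to be named "tensorflow".
        container = {"name": "tensorflow", "image": image}
        if gpus:
            # Request GPUs through the NVIDIA device plugin resource name.
            container["resources"] = {"limits": {"nvidia.com/gpu": gpus}}
        return {
            "replicas": count,
            "restartPolicy": "OnFailure",
            "template": {"spec": {"containers": [container]}},
        }

    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "TFJob",
        "metadata": {"name": name},
        "spec": {
            "tfReplicaSpecs": {
                # Parameter servers hold shared model state and need no GPU here.
                "PS": replica(parameter_servers, 0),
                # Workers run the training steps on GPU-enabled nodes.
                "Worker": replica(workers, gpus_per_worker),
            }
        },
    }


if __name__ == "__main__":
    # Assumed image published to ACR in step 1; replace with your own registry and tag.
    manifest = build_tfjob_manifest("mnist-train", "myregistry.azurecr.io/ml-train:v1")
    print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and running `kubectl apply -f` on it would submit the job, since `kubectl` accepts JSON manifests as well as YAML. Whether GPUs are actually scheduled depends on the node pool using a GPU-enabled VM size and on the NVIDIA device plugin being available on those nodes.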