Automating machine learning workflows to infuse AI in Visual Studio

See how data scientists and engineers in the Microsoft developer division turned a successful experiment into a high-traffic product feature with machine learning operations (MLOps) practices.

The challenge: From prototype to production at scale

After six months of AI and machine learning experiments aimed at improving developer productivity, a small team of applied data scientists in the Microsoft developer division arrived at a model that accurately predicted the C# methods a developer was likely to call as they coded.

This successful machine learning prototype would become the basis for Visual Studio IntelliCode, an AI-assisted code prediction capability. But first it had to pass rigorous quality, availability, and scaling tests to meet the requirements of Visual Studio users. To get there, the data scientists would need to partner with the engineering team to build a machine learning platform and automate the training process. And both teams would need to adopt an MLOps culture, extending DevOps principles to the end-to-end machine learning lifecycle.

Together, the applied science and engineering teams built a machine learning pipeline to iterate on the model training process and automate much of the work the applied science team had done manually in the prototype stage. That pipeline allowed IntelliCode to scale to support six programming languages, regularly training new models on code examples from an extensive set of open-source GitHub repositories.

"Clearly, we were going to be doing a lot of compute-intensive model training on very large data sets every month—making the need for an automated, scalable, end-to-end machine learning pipeline all that more evident."

Gearard Boland, Principal Software Engineering Manager, Data and AI team

Capitalizing on insights with MLOps

As IntelliCode rolled out, the teams saw an opportunity to design an even better user experience: creating team completion models based on each customer's specific coding habits. Personalizing those machine learning models would require training and publishing models on demand, automatically, whenever a Visual Studio or Visual Studio Code user requested it. To perform those functions at scale with the existing pipeline, the teams used Azure services such as Azure Machine Learning, Azure Data Factory, Azure Batch, and Azure Pipelines.
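The on-demand flow described above can be sketched in plain Python. This is a minimal, illustrative outline only: every function and name here is hypothetical, and the real IntelliCode system implements these stages with Azure Machine Learning, Azure Data Factory, Azure Batch, and Azure Pipelines rather than local function calls.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ModelRequest:
    """A user-initiated request for a custom completion model (hypothetical)."""
    repo_id: str   # the customer's codebase to learn from
    language: str  # e.g. "csharp"

def train(request: ModelRequest) -> dict:
    """Placeholder for a compute-intensive training job (Azure Batch in the real system)."""
    return {"repo_id": request.repo_id, "language": request.language, "accuracy": 0.0}

def evaluate(model: dict, threshold: float = 0.0) -> bool:
    """Gate publication on a minimum quality bar before the model ships to the IDE."""
    return model["accuracy"] >= threshold

def publish(model: dict) -> str:
    """Placeholder for packaging and publishing the model for Visual Studio to download."""
    return f"models/{model['repo_id']}/{model['language']}"

def handle_request(request: ModelRequest) -> Optional[str]:
    """End-to-end on-demand flow: train, evaluate, publish (or reject)."""
    model = train(request)
    if evaluate(model):
        return publish(model)
    return None

print(handle_request(ModelRequest("contoso/app", "csharp")))
# → models/contoso/app/csharp
```

The key design point the sketch illustrates is that every stage is automated and gated: a request flows through training, evaluation, and publication without manual intervention, which is what made serving requests at scale feasible.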

"When we added support for custom models, the scalability and reliability of our training pipeline became even more important."

Gearard Boland, Principal Software Engineering Manager, Data and AI team

Bringing two different perspectives together

To build their machine learning pipeline, the teams had to define common standards and guidelines so that they could speak a common language, share best practices, and collaborate better. They also had to understand each other’s approaches to the project. While the data science team worked experimentally—iterating quickly on model creation—the engineering team focused on ensuring IntelliCode met Visual Studio users’ expectations for production-level features.

Today, the entire machine learning pipeline—training, evaluation, packaging, and deployment—runs automatically and serves more than 9,000 monthly model creation requests from Visual Studio and Visual Studio Code users. The teams are looking for ways to use their pipeline to build additional AI capabilities into other Microsoft products and provide even richer experiences to customers.

See how the teams implemented MLOps step by step.

Read the full story