AI + Machine Learning, Azure Cognitive Search, Azure Machine Learning, Internet of Things, Partners

Meta selects Azure as strategic cloud provider to advance AI innovation and deepen PyTorch collaboration

By Eric Boyd Corporate Vice President, Azure AI Platform, Microsoft

Meta selects Azure as strategic cloud provider to advance AI innovation and deepen PyTorch collaboration • 2 min read

Posted on May 25, 2022
2 min read

Microsoft is committed to the responsible advancement of AI to enable every person and organization to achieve more. Over the last few months, we have talked about advancements in our Azure infrastructure, Azure Cognitive Services, and Azure Machine Learning to make Azure better at supporting the AI needs of all our customers, regardless of their scale. Meanwhile, we also work closely with some of the leading research organizations around the world to empower them to build great AI.

Today, we’re thrilled to announce an expansion of our ongoing collaboration with Meta: Meta has selected Azure as a strategic cloud provider to help accelerate AI research and development.

As part of this deeper relationship, Meta will expand its use of Azure’s supercomputing power to accelerate AI research and development for its Meta AI group. Meta will utilize a dedicated Azure cluster of 5400 GPUs using the latest virtual machine (VM) series in Azure (NDm A100 v4 series, featuring NVIDIA A100 Tensor Core 80GB GPUs) for some of their large-scale AI research workloads. In 2021, Meta began using Microsoft Azure Virtual Machines (NVIDIA A100 80GB GPUs) for some of its large-scale AI research after experiencing Azure’s impressive performance and scale. With four times the GPU-to-GPU bandwidth between virtual machines compared to other public cloud offerings, the Azure platform enables faster distributed AI training. Meta used this, for example, to train their recent OPT-175B language model. The NDm A100 v4 VM series on Azure also gives customers the flexibility to configure clusters of any size automatically and dynamically from a few GPUs to thousands, and the ability to pause and resume during experimentation. Now, the Meta AI team is expanding their usage and bringing more cutting-edge machine learning training workloads to Azure to help further advance their leading AI research.

In addition, Meta and Microsoft will collaborate to scale PyTorch adoption on Azure and accelerate developers' journey from experimentation to production. Azure provides a comprehensive top to bottom stack for PyTorch users with best-in-class hardware (NDv4s and Infiniband). In the coming months, Microsoft will build new PyTorch development accelerators to facilitate rapid implementation of PyTorch-based solutions on Azure. Microsoft will also continue providing enterprise-grade support for PyTorch to enable customers and partners to deploy PyTorch models in production on both cloud and edge.

“We are excited to deepen our collaboration with Azure to advance Meta’s AI research, innovation, and open-source efforts in a way that benefits more developers around the world,” Jerome Pesenti, Vice President of AI, Meta. “With Azure’s compute power and 1.6 TB/s of interconnect bandwidth per VM we are able to accelerate our ever-growing training demands to better accommodate larger and more innovative AI models. Additionally, we’re happy to work with Microsoft in extending our experience to their customers using PyTorch in their journey from research to production.”

By scaling Azure’s supercomputing power to train large AI models for the world’s leading research organizations, and by expanding tools and resources for open source collaboration and experimentation, we can help unlock new opportunities for developers and the broader tech community, and further our mission to empower every person and organization around the world.

Meta selects Azure as strategic cloud provider to advance AI innovation and deepen PyTorch collaboration

Explore

Related posts

Microsoft and NVIDIA partnership continues to deliver on the promise of AI

Microsoft and Mistral AI announce new partnership to accelerate AI innovation and introduce Mistral Large first on Azure

Explore cutting-edge AI solutions with Microsoft at NVIDIA GTC

Achieve generative AI operational excellence with the LLMOps maturity model

Popular

AI + machine learning

Analytics

Compute

Containers

Databases

DevOps

Developer tools

Hybrid + multicloud

Identity

Integration

Internet of Things

Management and governance

Media

Migration

Mixed reality

Mobile

Networking

Security

Storage

Web

Virtual desktop infrastructure

Use cases

Application development

AI

Cloud migration and modernization

Data and analytics

Hybrid cloud and infrastructure

Internet of Things

Security and governance

Organization type

Resources

Explore

Related posts