Azure HBv3 virtual machines for HPC, now up to 80 percent faster with AMD Milan-X CPUs

Preview live today, available globally soon

We are announcing that a private preview is now live for Azure HBv3 virtual machines enhanced by 3rd Gen AMD EPYC™ processors with AMD 3D V-Cache, codenamed “Milan-X”. These processors significantly improve the performance, scaling efficiency, and cost-effectiveness of a variety of memory performance-bound workloads such as computational fluid dynamics (CFD), explicit finite element analysis (FEA), computational geoscience, weather simulation, and silicon design RTL workflows.

Compared to the current HBv3-series with 3rd Gen AMD EPYC processors, already the highest performance VM for HPC workloads on the public cloud, customers will experience up to:

  • 80 percent higher performance for CFD
  • 60 percent higher performance for EDA RTL
  • 50 percent higher performance for explicit FEA
  • 19 percent higher performance for weather simulation

In addition, all HBv3-series VMs globally will soon be upgraded with Milan-X processors. This upgrade will be provided at no additional cost beyond existing pricing for HBv3-series VMs, and with no changes required of customer workloads. No other changes are being made to the HBv3-series VM sizes customers already know and rely on for their critical research and business workloads. For more information on the Azure HBv3-series, please see the official documentation.

Turbocharging memory performance-bound HPC workloads

Many HPC workloads are driven foremost by memory performance. For some, such as computational fluid dynamics, performance is directly driven by memory bandwidth. For others, such as RTL simulation, the workhorse application of silicon design firms, it is driven by memory latency. The forthcoming upgrade to HBv3-series VMs will address both needs by growing L3 cache to an unprecedented 1.5 gigabytes per virtual machine. This is three times the L3 cache of the standard 3rd Gen EPYC processors currently in HBv3-series VMs, and more than 25 times the total L3 cache found in most HPC servers in customer datacenters today.

For memory bandwidth-bound workloads run at an appropriate scale, the net effect of the larger L3 cache is an up to 1.8x increase in effective memory bandwidth. This means an HBv3 VM that today delivers 350 GB/s (as measured by STREAM-TRIAD) will soon perform more like a VM with more than 600 GB/s of memory bandwidth.
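The arithmetic behind the "greater than 600 GB/s" figure can be checked in a few lines. This is a minimal sketch that simply multiplies the two numbers quoted above; the constant and function names are illustrative, and the 1.8x factor is an "up to" value, so the result is a best case rather than a guarantee.

```python
# Figures quoted in the text above.
STREAM_TRIAD_GBPS = 350.0  # memory bandwidth of an HBv3 VM today
MAX_UPLIFT = 1.8           # best-case ("up to") gain from the larger L3 cache


def effective_bandwidth(measured_gbps: float, uplift: float) -> float:
    """Return the memory bandwidth a VM would need to show the same behaviour."""
    return measured_gbps * uplift


best_case = effective_bandwidth(STREAM_TRIAD_GBPS, MAX_UPLIFT)
print(f"Best-case effective bandwidth: {best_case:.0f} GB/s")
```

Workloads whose working set does not fit well in the larger cache would see a smaller factor than the best case used here.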

For memory latency-bound workloads, the net effect of the larger L3 cache in Milan-X processors is an up to 50 percent increase in cache hit rate and an overall cache latency range (latency from one core to its nearest and farthest neighbor cores) that is 50 percent lower than on standard 3rd Gen EPYC processors.

Below is a sample of at-scale performance results with Azure HBv3 VMs enhanced with Milan-X processors as compared to the existing HBv3 series with standard 3rd Gen EPYC processors. Tests were conducted across a range of MPI scale scenarios, from 2 to 64 VMs (240 to 7,680 CPU cores).
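The core counts quoted above follow directly from the VM counts. As a quick consistency check, the sketch below derives the cores-per-VM figure implied by the text (240 cores across 2 VMs) and applies it to the largest run. The names are illustrative and are not part of any Azure tooling.

```python
# Figures quoted in the text: the smallest run used 2 VMs and 240 CPU cores.
SMALLEST_RUN_VMS = 2
SMALLEST_RUN_CORES = 240

# Cores per VM implied by those two numbers (an inference from the text).
CORES_PER_VM = SMALLEST_RUN_CORES // SMALLEST_RUN_VMS


def total_cores(vm_count: int) -> int:
    """Return the CPU core count of an MPI job spanning vm_count VMs."""
    return vm_count * CORES_PER_VM


for vms in (2, 64):
    print(f"{vms} VMs -> {total_cores(vms):,} cores")
```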

Figure 1: Relative HPC performance (bar chart). Benchmarks were taken with WRF v. 4.15, OpenFOAM v. 1912, and Ansys Fluent 2021 R1

Learn additional information about performance and scalability.

The more you scale, the less you pay … no, really

Azure’s use of Quantum InfiniBand networking from our partners at NVIDIA already enables customers to scale MPI workloads to supercomputer heights. Milan-X processors in HBv3-series VMs raise the bar yet again by delivering better-than-linear scaling efficiency across a range of MPI workloads and models. This means customers actually save on Azure compute costs as a result of realizing dramatically faster time-to-solution. Below is an example of this capability in action with Ansys Fluent and the canonical F1_racecar_140m simulation. HBv3 VMs with Milan-X processors deliver nearly 200 percent scaling efficiency, yielding 127 times higher performance with only 64 VMs' worth of compute.

Figure 2: Measured scaling efficiency from 1 to 64 VMs using Ansys Fluent 2021 R1 (line chart with an upward trend)
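To see why better-than-linear scaling lowers cost, the sketch below works through the numbers quoted above (127 times the performance on 64 VMs). It rests on two assumptions that the text does not state: that the speedup is measured against a single VM, and that compute cost is proportional to the number of VMs multiplied by wall-clock time. The function names are illustrative.

```python
def scaling_efficiency(speedup: float, vm_count: int) -> float:
    """Speedup achieved per VM; 1.0 corresponds to perfectly linear scaling."""
    return speedup / vm_count


def relative_cost(speedup: float, vm_count: int) -> float:
    """Cost of the scaled-out run relative to the single-VM run.

    Assumes cost is proportional to VMs x runtime, and that runtime
    shrinks by the speedup factor.
    """
    return vm_count / speedup


SPEEDUP = 127.0  # quoted: 127 times higher performance
VM_COUNT = 64    # quoted: 64 VMs

efficiency = scaling_efficiency(SPEEDUP, VM_COUNT)
cost = relative_cost(SPEEDUP, VM_COUNT)

print(f"Scaling efficiency: {efficiency:.1%}")
print(f"Cost relative to a single-VM run: {cost:.1%}")
```

Under these assumptions, any efficiency above 100 percent means the job finishes sooner and also consumes fewer total VM-hours than the single-VM run.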

Learn additional information about performance and scalability across a range of applications, models, and configurations.

Continuous improvement for Azure HPC customers

Microsoft and AMD share a vision for a new era of high-performance computing in the cloud: one defined by continuous improvements to the critical research and business workloads that matter most to our customers. Azure has partnered with AMD to make this vision a reality by raising the bar on the performance, scalability, and value we deliver with every release of Azure HB-series virtual machines.

Figure 3: Azure HPC performance 2019 through 2021

We look forward to bringing Milan-X processors to all Azure HPC customers soon in the HBv3-series of virtual machines, and to seeing everyone in the preview.