Azure high-performance computing at SC’19

Posted on 19 November, 2019

Principal Program Manager, Azure HPC

HBv2 Virtual Machines for HPC, Azure’s most powerful yet, now in preview

Azure HBv2-series Virtual Machines (VMs) for high-performance computing (HPC) are now in preview in the South Central US region.

HBv2-series Virtual Machines are Azure’s most advanced HPC offering yet, featuring performance and Message Passing Interface scalability rivaling the most advanced supercomputers on the planet, and price and performance on par with on-premises HPC deployments.

HBv2 Virtual Machines are designed for a variety of real-world HPC applications, from fluid dynamics to finite element analysis, molecular dynamics, seismic processing & imaging, weather modeling, rendering, computational chemistry, and more.

Each HBv2 Virtual Machine features 120 AMD EPYC™ 7742 processor cores at 2.45 GHz (3.3 GHz Boost), 480 GB of RAM, 480 MB of L3 cache, and no simultaneous multithreading. An HBv2 Virtual Machine also provides up to 340 GB per second of memory bandwidth, up to four teraflops of double-precision compute, and up to eight teraflops of single-precision compute.
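To put these per-VM figures in per-core terms, the short Python sketch below simply divides the numbers quoted above by the core count. The variable names are illustrative, and the bytes-per-FLOP ratio is built from the two "up to" peak figures, so it should be read as a rough estimate rather than a measured value.

```python
# Per-VM figures quoted in this post for an HBv2 Virtual Machine.
CORES = 120
RAM_GB = 480
L3_CACHE_MB = 480
MEM_BANDWIDTH_GB_S = 340   # "up to" figure from the post
PEAK_FP64_TFLOPS = 4       # "up to" figure from the post

# Resources available to each core if they are shared evenly.
ram_per_core_gb = RAM_GB / CORES
l3_per_core_mb = L3_CACHE_MB / CORES
bandwidth_per_core_gb_s = MEM_BANDWIDTH_GB_S / CORES

# Machine balance: bytes of memory traffic per double-precision FLOP,
# using decimal units (1 GB = 1e9 bytes, 1 teraflop = 1e12 FLOP/s).
bytes_per_flop = (MEM_BANDWIDTH_GB_S * 1e9) / (PEAK_FP64_TFLOPS * 1e12)

print(f"RAM per core:        {ram_per_core_gb:.1f} GB")
print(f"L3 cache per core:   {l3_per_core_mb:.1f} MB")
print(f"Bandwidth per core:  {bandwidth_per_core_gb_s:.2f} GB/s")
print(f"Bytes per FP64 FLOP: {bytes_per_flop:.3f}")
```

Ratios like these are a common first check when judging whether a memory-bound code, such as many fluid dynamics or seismic kernels, is likely to be limited by bandwidth rather than by compute.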

Finally, an HBv2 Virtual Machine features 900 GB of low-latency, high-bandwidth block storage via NVMeDirect, and supports up to eight Azure Managed Disks.

200 Gigabit high data rate (HDR) InfiniBand comes to Azure

HBv2-series Virtual Machines feature one of the cloud’s first deployments of 200 Gigabit per second HDR InfiniBand networking from Mellanox, which provides up to 8 times higher bandwidth and 16 times lower latencies than found elsewhere on the public cloud.

With HBv2 Virtual Machines, Azure is also introducing two new network features to support the highest sustained performance for tightly coupled workloads. The first is adaptive routing, which helps optimize Message Passing Interface performance on congested networks. The second is support for dynamic connected transport (DCT), which provides reliable transport and enhances scalable, asynchronous, high-performance communication.

As with HB and HC Virtual Machines, HBv2 Virtual Machines support hardware-based offload for Message Passing Interface collectives.

Azure & Cray deliver cloud-based seismic imaging at 28,000 cores, 42 GB per second reads, and 62 GB per second write performance

Customers come to Azure for our ability to support their largest and most critical workloads. Energy companies have been among the first and most eager to embrace our advanced HPC capabilities, including for their core subsurface discovery workloads. Advances in subsurface computing support more accurate identification of energy resources, as well as safer extraction of these resources from challenging areas such as beneath thick deposits of salt in the Gulf of Mexico.

As part of our work with one of our strategic energy exploration customers, today we are sharing that Azure recently supported what is believed to be one of the largest cloud-based seismic processing workloads yet.

Powered by up to 468 Azure HB Virtual Machines totaling 28,080 first-generation AMD EPYC CPU cores and more than 123 terabytes per second of aggregate memory bandwidth, the customer was able to run imaging jobs utilizing a variety of pre-stack and post-stack migration, full-waveform inversion, and real-time migration techniques.
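As a quick sanity check on these aggregate figures, the Python sketch below derives the implied per-VM values from the totals quoted above. It assumes decimal units (1 terabyte = 1,000 gigabytes) and treats the "more than 123 terabytes per second" figure as a lower bound; the variable names are illustrative.

```python
# Aggregate figures quoted in this post for the seismic workload.
VM_COUNT = 468
TOTAL_CORES = 28_080
AGGREGATE_BANDWIDTH_TB_S = 123   # "more than" figure, so a lower bound

# Implied per-VM values, assuming every VM contributes equally.
cores_per_vm = TOTAL_CORES / VM_COUNT
bandwidth_per_vm_gb_s = AGGREGATE_BANDWIDTH_TB_S * 1000 / VM_COUNT

print(f"Cores per HB VM:            {cores_per_vm:.0f}")
print(f"Memory bandwidth per HB VM: at least {bandwidth_per_vm_gb_s:.0f} GB/s")
```

The totals work out to 60 cores and roughly 263 GB per second of memory bandwidth for each HB Virtual Machine in the pool.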

Seismic imaging is as much about data movement as it is about compute, however. To support this record-scale customer workload, Cray provided the supercomputing firm’s vaunted ClusterStor storage system. Announced earlier this year, Cray® ClusterStor™ in Azure is a dedicated Lustre filesystem solution to accelerate data processing for the largest and most complex HPC and AI jobs run on Azure, and can optionally be connected to Azure H-series Virtual Machines. Not only does Cray ClusterStor in Azure leverage the same technology that powers many of the fastest HPC filesystems on the planet, it is also among the most affordable on the cloud. Over a typical three-year reserved instance period, Cray ClusterStor in Azure can cost as little as 1/10th of Lustre offerings found on other public clouds.

The combination of Azure HB-series Virtual Machines and Cray ClusterStor provided a highly scalable solution, delivering an 11.5x improvement in time to solution as the pool of compute virtual machines was increased from 16 to 400.
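One way to read the scaling result above is as a parallel efficiency: the observed speedup divided by the ideal speedup from adding VMs. The Python sketch below works this out, assuming the 11.5x improvement is measured relative to the 16-VM run; the function name and that interpretation are assumptions, not details given in this post.

```python
def parallel_efficiency(speedup, base_units, scaled_units):
    """Return observed speedup as a fraction of ideal (linear) speedup."""
    ideal_speedup = scaled_units / base_units
    return speedup / ideal_speedup


# Figures quoted in this post: 11.5x faster going from 16 to 400 VMs.
ideal = 400 / 16
efficiency = parallel_efficiency(11.5, base_units=16, scaled_units=400)

print(f"Ideal speedup:       {ideal:.1f}x")
print(f"Observed speedup:    11.5x")
print(f"Parallel efficiency: {efficiency:.0%}")
```

Under that reading, a 25-fold increase in VMs yielded an 11.5-fold reduction in time to solution, or about 46 percent of ideal linear scaling.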

The Cray ClusterStor in Azure storage solution, whose measured performance peaked at 42 GB per second (reads) and 62 GB per second (writes), also delivered significant differentiation for the customer by driving a 66 percent improvement in application performance compared to an alternative, high-performance network file system (NFS) approach.

Available now

Azure HBv2-series Virtual Machines are currently available in South Central US, with additional regions rolling out soon.