Announcing the second-generation HB-series Azure Virtual Machines for high-performance computing (HPC). HBv2 Virtual Machines are designed to deliver leadership-class performance, message passing interface (MPI) scalability, and cost efficiency for a variety of real-world HPC workloads.
HBv2 Virtual Machines feature 120 AMD EPYC™ 7002-series CPU cores, 480 GB of RAM, 480 MB of L3 cache, and no simultaneous multithreading (SMT). HBv2 Virtual Machines provide up to 350 GB/sec of memory bandwidth, which is 45-50 percent more than comparable x86 alternatives and three times faster than what most HPC customers have in their datacenters today.
[Table: HBv2 VM size specifications. Recoverable columns: memory per CPU core (GB), local SSD (GiB).]
‘r’ denotes support for RDMA. ‘s’ denotes support for Premium SSD disks.
Each HBv2 virtual machine (VM) also features up to 4 teraFLOPS of double-precision performance and up to 8 teraFLOPS of single-precision performance. This is a fourfold increase over our first generation of HB-series Virtual Machines, and it substantially improves performance for applications demanding the fastest memory and leadership-class compute density.
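To put the per-VM figures in context, the sketch below divides them across the 120 cores and computes a simple machine balance (peak bytes per peak FLOP). It uses only the numbers quoted above; the function names are illustrative, and the even per-core split is a simplifying assumption, since L3 cache and memory bandwidth are shared resources rather than per-core allocations.

```python
# Published per-VM figures from the announcement ("up to" values are peaks).
CORES = 120
RAM_GB = 480
L3_CACHE_MB = 480
MEM_BANDWIDTH_GB_S = 350  # peak memory bandwidth
FP64_TFLOPS = 4           # peak double precision
FP32_TFLOPS = 8           # peak single precision


def per_core_ratios() -> dict:
    """Divide each per-VM figure evenly across the cores (a simplification)."""
    return {
        "ram_gb": RAM_GB / CORES,
        "l3_cache_mb": L3_CACHE_MB / CORES,
        "mem_bandwidth_gb_s": MEM_BANDWIDTH_GB_S / CORES,
        "fp64_gflops": FP64_TFLOPS * 1000 / CORES,
    }


def bytes_per_flop(bandwidth_gb_s: float, tflops: float) -> float:
    """Machine balance: peak bytes moved per peak floating-point operation."""
    return (bandwidth_gb_s * 1e9) / (tflops * 1e12)


if __name__ == "__main__":
    for name, value in per_core_ratios().items():
        print(f"{name}: {value:.2f}")
    balance = bytes_per_flop(MEM_BANDWIDTH_GB_S, FP64_TFLOPS)
    print(f"fp64 balance: {balance:.4f} bytes/FLOP")
```

A memory-bound code cares mostly about the bandwidth-per-core figure, while a compute-bound code cares about the FLOPS-per-core figure; the balance ratio gives a rough sense of which limit a given kernel is likely to hit first.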
Below are preliminary benchmarks on HBv2 across several common HPC applications and domains:
To drive optimal at-scale message passing interface (MPI) performance, HBv2 Virtual Machines feature 200 Gb/s HDR InfiniBand from our technology partners at Mellanox. The InfiniBand fabric backing HBv2 Virtual Machines is a non-blocking fat-tree with a low-diameter design for consistent, ultra-low latencies. Customers can use standard Mellanox/OFED drivers just as they would in a bare-metal environment. HBv2 Virtual Machines officially support RDMA verbs and hence support all InfiniBand-based MPIs, such as OpenMPI, MVAPICH2, Platform MPI, and Intel MPI. Customers can also leverage hardware offload of MPI collectives to realize additional performance, as well as efficiency gains for commercially licensed applications.
Across a single virtual machine scale set, customers can run a single MPI job on HBv2 Virtual Machines at up to 36,000 cores. For our largest customers, HBv2 Virtual Machines support up to 80,000 cores for single jobs.
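As a rough sizing aid, the core limits above can be translated into VM counts. The sketch below assumes one MPI rank per core and that all 120 cores of each VM are used; the function names are hypothetical and not part of any Azure tooling.

```python
CORES_PER_VM = 120  # cores per HBv2 VM, from the specification above


def vms_needed(job_cores: int, cores_per_vm: int = CORES_PER_VM) -> int:
    """Smallest number of whole VMs providing at least job_cores cores."""
    if job_cores <= 0:
        raise ValueError("job_cores must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (job_cores + cores_per_vm - 1) // cores_per_vm


def max_whole_vms(core_limit: int, cores_per_vm: int = CORES_PER_VM) -> int:
    """Largest number of whole VMs whose cores fit within core_limit."""
    if core_limit < 0:
        raise ValueError("core_limit must be non-negative")
    return core_limit // cores_per_vm


if __name__ == "__main__":
    for limit in (36_000, 80_000):
        vms = max_whole_vms(limit)
        print(f"{limit} cores -> {vms} full VMs ({vms * CORES_PER_VM} cores)")
```

Under these assumptions, the 36,000-core scale set limit corresponds to exactly 300 VMs, while 80,000 is not a multiple of 120, so the largest job made of fully populated VMs would use 666 VMs (79,920 cores).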
Customers can also maximize the Ethernet interface of HBv2 Virtual Machines by using the SRIOV-based accelerated networking in Azure, which yields up to 40 Gb/s of bandwidth and consistent, low latencies.
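Network rates here are quoted in gigabits per second (Gb/s), while memory and storage rates are quoted in gigabytes per second (GB/sec), so a quick conversion helps when comparing them. The sketch below converts the two link speeds and estimates an ideal transfer time; the 100 GB dataset size is a hypothetical value chosen for illustration, and protocol overhead is ignored.

```python
def gbit_to_gbyte_per_s(gbit_per_s: float) -> float:
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbit_per_s / 8


def transfer_seconds(size_gb: float, link_gbit_per_s: float) -> float:
    """Ideal time to move size_gb gigabytes at full line rate, no overhead."""
    if link_gbit_per_s <= 0:
        raise ValueError("link rate must be positive")
    return size_gb / gbit_to_gbyte_per_s(link_gbit_per_s)


if __name__ == "__main__":
    dataset_gb = 100  # hypothetical dataset size, for illustration only
    links = (("HDR InfiniBand", 200), ("Accelerated networking", 40))
    for label, rate in links:
        print(
            f"{label}: {gbit_to_gbyte_per_s(rate):.0f} GB/s, "
            f"{transfer_seconds(dataset_gb, rate):.0f} s for {dataset_gb} GB"
        )
```

At line rate, 200 Gb/s corresponds to 25 GB/s and 40 Gb/s to 5 GB/s; real transfers will be slower once protocol and software overheads are included.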
Finally, the new H-series Virtual Machines feature local NVMe SSDs to deliver ultra-fast temporary storage for the full range of file sizes and I/O patterns. Using modern burst-buffer technologies like BeeGFS BeeOND, the new H-series Virtual Machines can deliver more than 900 GB/sec of peak injection I/O performance across a single virtual machine scale set. The new H-series Virtual Machines will also support Azure Premium SSD disks.
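For a sense of scale, the aggregate I/O figure can be divided across the VMs in a scale set. The sketch below assumes the 900 GB/sec peak refers to a full scale set of 300 VMs (36,000 cores at 120 cores per VM); that pairing is an assumption made here, not something the announcement states, and the function name is illustrative.

```python
def per_vm_io_gb_s(aggregate_gb_s: float, vm_count: int) -> float:
    """Average share of aggregate I/O bandwidth contributed by each VM."""
    if vm_count <= 0:
        raise ValueError("vm_count must be positive")
    return aggregate_gb_s / vm_count


if __name__ == "__main__":
    aggregate = 900            # GB/sec, peak injection I/O quoted above
    vm_count = 36_000 // 120   # assumed full scale set: 300 VMs
    share = per_vm_io_gb_s(aggregate, vm_count)
    print(f"{share:.1f} GB/sec per VM across {vm_count} VMs")
```

Under that assumption, each VM's local NVMe storage would contribute roughly 3 GB/sec to the burst buffer.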
Customers can accelerate their HBv2 deployments with a variety of resources optimized and pre-configured by the Azure HPC team. Our pre-built HPC image for CentOS is tuned for optimal performance and bundles key HPC tools such as MPI libraries and compilers. The AzureHPC Project helps customers deploy an end-to-end Azure HPC environment reliably and quickly, and includes deployment scripts for setting up building blocks for networking, compute, schedulers, and storage. Also included is a growing list of tutorials for running HPC applications themselves.
For customers who are familiar with HPC schedulers and would like to use them with HBv2 Virtual Machines, Azure CycleCloud is the simplest way to orchestrate autoscaling clusters. Azure CycleCloud supports schedulers such as Slurm, PBSPro, LSF, GridEngine, and HTCondor, and enables hybrid deployments for customers wishing to pair HBv2 Virtual Machines with their existing on-premises clusters. The new H-series Virtual Machines will also be supported by Azure Batch for cloud-native batch processing. HBv2 Virtual Machines will be available to all Azure platform partners.
Customers can sign up for HBv2 access today by filling out this form. HBv2 Virtual Machines will initially be available in the South Central US and West Europe Azure regions, with availability in additional regions soon thereafter.