5 min read
For more than three decades, the researchers and practitioners that make up the high-performance computing (HPC) community will come together for their annual event. More than ten thousand strong in attendance, the global community will converge on Denver, Colorado to advance the state-of-the-art in HPC. The theme for Supercomputing ‘19 is “HPC is now” – a theme that resonates strongly with the Azure HPC team given the breakthrough capabilities we’ve been working to deliver to customers.
Azure is upending preconceptions of what the cloud can do for HPC by delivering supercomputer-grade technologies, innovative services, and performance that rivals or exceeds some of the most optimized on-premises deployments. We’re working to ensure Azure is paving the way for a new era of innovation, research, and productivity.
At the show, we’ll showcase Azure HPC and partner solutions, benchmarking white papers and customer case studies – here’s an overview of what we’re delivering.
- Massive Scale MPI – Solve problems at the limits of your imagination, not the limits of other public cloud’s commodity network. Azure supports your tightly coupled workloads up to 80,000 cores per job featuring the latest HPC-grade CPUs and ultra-low latency InfiniBand hDR networking.
- Accelerated Compute – Choose from the latest GPUs, field programmable gate array (FPGAs), and now IPUs for maximum performance and flexibility across your HPC, AI, and visualization workloads.
- Apps and Services – Leverage advanced software and storage for every scenario: from hybrid cloud to cloud migration, from POCs to production, from optimized persistent deployments to agile environment reconfiguration. Azure HPC software has you covered.
Azure HPC unveils new offerings
- The preview of new second gen AMD EPYC based HBv2 Azure Virtual Machines for HPC – HBv2 virtual machines (VMs) deliver supercomputer-class performance, message passing interface (MPI) scalability, and cost efficiency for a variety of real-world HPC workloads. HBv2 VMs feature 120 non-hyperthreaded CPU cores from the new AMD EPYC™ 7742 processor, up to 45 percent more memory bandwidth than x86 alternatives, and up to 4 teraFLOPS of double-precision compute. Leveraging the cloud’s first deployment of 200 Gigabit HDR InfiniBand from Mellanox, HBv2 VMs support up to 80,000 cores per MPI job to deliver workload scalability and performance that rivals some of the world’s largest and most powerful bare metal supercomputers. HBv2 is not just one of the most powerful HPC servers Azure has ever made, but also one of the most affordable. HBv2 VMs are available now.
- The preview of new NVv4 Azure Virtual Machines for virtual desktops – NVv4 VMs enhance Azure’s portfolio of Windows Virtual Desktops with the introduction of the new AMD EPYC™ 7742 processor and the AMD RADEON INSTINCT™ MI25 GPUs. NVv4 gives customers more choice and flexibility by offering partitionable GPUs. Customers can select a virtual GPU size that is right for their workloads and price points, with as little as 2GB of dedicated GPU frame buffer for an entry-level desktop in the cloud, up to an entire MI25 GPU with 16GB of HBM2 memory to provide powerful workstation-class experience. NVv4 VMs are available now in preview.
- The preview NDv2 Azure GPU Virtual Machines – NDv2 VMs are the latest, fastest, and most powerful addition to Azure GPU family, and are designed specifically for the most demanding distributed HPC, artificial intelligence (AI), and machine learning (ML) workloads. These VMs feature 8 NVIDIA Tesla V100 NVLink interconnected GPUs each with 32 GB of HBM2 memory, 40 non-hyperthreaded cores from the Intel Xeon Platinum 8168 processor, and 672 GiB of system memory. NDv2-series virtual machines also now feature 100 Gigabit EDR InfiniBand from Mellanox with support for standard OFED drivers and all MPI types and versions. NDv2-series virtual machines are ready for the most demanding machine learning models and distributed AI training workloads with out-of-box NCCL2 support for InfiniBand- allowing for easy scale-up to supercomputer-sized clusters that can run workloads utilizing CUDA, including popular ML tools and frameworks. NDv2 VMs are available now in preview.
- Azure HPC Cache now available – When it comes to file performance, the new Azure HPC Cache delivers flexibility and scale. Use this service right from the Azure Portal to connect high-performance computing workloads to on-premises network-attached storage or run Azure Blob as the portable operating system interface.
- The preview of new NDv3 Azure Virtual Machines for AI –The preview of NDv3 VMs are Azure’s first offering featuring the Graphcore IPU, designed from the ground up for AI training and inference workloads. The IPU novel architecture enables high-throughput processing of neural networks even at small batch sizes, which accelerates training convergence and enables short inference latency. With the launch of NDv3, Azure is bringing the latest in AI silicon innovation to the public cloud and giving customers additional choice in how they develop and run their massive scale AI training and inference workloads. NDv3 VMs feature 16 IPU chips, each with over 1200 cores and over 7000 independent threads and a large 300MB on-chip memory that delivers up to 45 TB/s of memory bandwidth. The 8 Graphcore accelerator cards are connected through a high-bandwidth, low-latency interconnect enabling large models to be trained in a model-parallel (including pipelining) or data parallel way. An NDv3 VM also includes 40 cores of CPU, backed by Intel Xeon Platinum 8168 processors and 768 GB of memory. Customers can develop applications for IPU technology using Graphcore’s Poplar® software development toolkit, and leverage the IPU-compatible versions of popular machine learning frameworks. NDv3 VMs are now available in preview.
NP Azure Virtual Machines for HPC coming soon – Our Alveo U250 FPGA-Accelerated VMs offer from 1-to-4 Xilinx U250 FPGA devices as an Azure VM- backed by powerful Xeon Platinum CPU cores, and fast NVMe-based storage. The NP series will enable true lift-and-shift and single-target development of FPGA applications for a general purpose cloud. Based on a board and software ecosystem customers can buy today, RTL and high-level language designs targeted at Xilinx’s U250 card and SDAccel 2019.1 runtime will run on Azure VMs just as they do on-premises and on the edge, enabling the bleeding edge of accelerator development to harness the power of the cloud without additional development costs.
Azure CycleCloud 7.9 Update – We are excited to announce the release of Azure CycleCloud 7.9. Version 7.9 focuses on improved operational clarity and control, in particular for large MPI workloads on Azure’s unique InfiniBand interconnected infrastructure. Among many other improvements, this release includes:
Improved error detection and reporting user interface (UI) that greatly simplify diagnosing VM issues.
Node time-to-live capabilities via a “Keep-alive” function, allowing users to build and debug MPI applications that are not affected by autoscaling policies.
VM placement group management through the UI that provides users direct control into node topology for latency sensitive applications.
Support for Ephemeral OS disks, which improve virtual machines and virtual machines scale sets start-up performance and cost.
Microsoft HPC Pack 2016, Update 3 – released in August, Update 3 includes significant performance and reliability improvements, support for Windows Server 2019 in Azure, a new VM extension for deploying Azure IaaS Windows nodes, and many other features, fixes, and improvements.
In all of our new offerings and alongside our partners, Azure HPC aims to consistently offer the latest in capabilities for HPC oriented use cases. Together with our partners Intel, AMD, Mellanox, NVIDIA, Graphcore, Xilinx, and many more, we look forward to seeing you next week in Denver!