Now available

SR-IOV availability on InfiniBand-equipped Virtual Machines

Published: July 24, 2020

We will be enabling support for all Message Passing Interface (MPI) implementations and Remote Direct Memory Access (RDMA) verbs on InfiniBand-equipped virtual machines. This greatly expands your options for using InfiniBand in your workloads.

The upgrade WILL INVOLVE SERVER DOWNTIME on a regional basis. If you intend to use the InfiniBand network, it also REQUIRES AN UPDATE TO YOUR VMs.

WHAT’S COMING? 

We will be enabling support for the entire MPI stack (all MPI implementations and RDMA verbs) on InfiniBand-equipped virtual machines. These enhancements will make it easier to use our high-bandwidth, low-latency InfiniBand network for your workloads.

IMPACT 

All users of the VM sizes listed in the update schedule will be impacted on a region-by-region basis. The update involves changes to both server hardware and software, which requires downtime. During the downtime:

  • Machines in the region will be unavailable for a 3-hour period.
  • All VMs in the region will be removed and redeployed after the update.
  • Data stored on local (ephemeral) disks will be lost. Storage Accounts are unaffected.

ACTION REQUIRED 

To avoid data loss and minimize potential impact to your service, please: 

  • Ensure all jobs are complete and data is backed up to your Storage Account before the scheduled update.  Any data stored locally will be lost. 

If you do not require InfiniBand or MPI

o    You do not need to make any changes to your image/drivers

If you do require InfiniBand or MPI 

o    For managed services supporting InfiniBand scenarios, please see service-specific guidance (e.g., Azure Batch, Azure Machine Learning).

o    Update your OS to a supported version that includes inbox drivers for InfiniBand, and test them beforehand (see the last bullet)

o    If not already included in your image, download and install the latest OFED driver (see steps here)

o    Test your updated image and drivers on VM sizes which are already SR-IOV enabled (see MPI section)
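
As a rough illustration of the testing step above, the following Python sketch lists the InfiniBand devices that a Linux guest has registered and reports the state of each port. It assumes a Linux image where loaded InfiniBand drivers expose devices under /sys/class/infiniband; the function names are hypothetical and are not part of any Azure tooling. It is a basic sanity check only and does not replace running your actual MPI workload on an SR-IOV enabled VM size.

```python
from pathlib import Path


def list_infiniband_devices(sysfs_root="/sys/class/infiniband"):
    """Return the names of InfiniBand devices the kernel has registered.

    Assumption: a Linux guest, where loaded InfiniBand drivers expose one
    directory per device under /sys/class/infiniband.
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        # No directory means no InfiniBand driver has registered a device.
        return []
    return sorted(entry.name for entry in root.iterdir())


def port_states(device, sysfs_root="/sys/class/infiniband"):
    """Map each port number of `device` to the contents of its state file."""
    ports_dir = Path(sysfs_root) / device / "ports"
    states = {}
    if not ports_dir.is_dir():
        return states
    for port in sorted(ports_dir.iterdir(), key=lambda p: p.name):
        state_file = port / "state"
        if state_file.is_file():
            states[port.name] = state_file.read_text().strip()
    return states


if __name__ == "__main__":
    devices = list_infiniband_devices()
    if not devices:
        print("No InfiniBand devices found; check that the drivers are installed.")
    for name in devices:
        print(name, port_states(name))
```

If the script reports no devices after the update, the image is likely missing the InfiniBand drivers described in the steps above.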

For any questions or concerns, please reach out to Azure GPU Feedback or Customer Service Support. 
