With ever-increasing data volume and latency requirements, GPUs have become an indispensable tool for doing machine learning (ML) at scale. This week, we are excited to announce two integrations that Microsoft and NVIDIA have built together to unlock industry-leading GPU acceleration for more developers and data scientists.
- Azure Machine Learning service is the first major cloud ML service to integrate RAPIDS, an open source software library from NVIDIA that allows traditional machine learning practitioners to easily accelerate their pipelines with NVIDIA GPUs
- ONNX Runtime has integrated the NVIDIA TensorRT acceleration library, enabling deep learning practitioners to achieve lightning-fast inferencing regardless of their choice of framework.
These integrations build on an already-rich infusion of NVIDIA GPU technology on Azure to speed up the entire ML pipeline.
“NVIDIA and Microsoft are committed to accelerating the end-to-end data science pipeline for developers and data scientists regardless of their choice of framework,” says Kari Briski, Senior Director of Product Management for Accelerated Computing Software at NVIDIA. “By integrating NVIDIA TensorRT with ONNX Runtime and RAPIDS with Azure Machine Learning service, we’ve made it easier for machine learning practitioners to leverage NVIDIA GPUs across their data science workflows.”
Azure Machine Learning service integration with NVIDIA RAPIDS
Azure Machine Learning service is the first major cloud ML service to integrate RAPIDS, providing up to 20x speedup for traditional machine learning pipelines. RAPIDS is a suite of libraries built on NVIDIA CUDA for doing GPU-accelerated machine learning, enabling faster data preparation and model training. RAPIDS dramatically accelerates common data science tasks by leveraging the power of NVIDIA GPUs.
Exposed on Azure Machine Learning service as a simple Jupyter Notebook, RAPIDS uses NVIDIA CUDA for high-performance GPU execution, exposing GPU parallelism and high memory bandwidth through a user-friendly Python interface. It includes a dataframe library called cuDF which will be familiar to Pandas users, as well as an ML library called cuML that provides GPU versions of all machine learning algorithms available in Scikit-learn. And with DASK, RAPIDS can take advantage of multi-node, multi-GPU configurations on Azure.
ONNX Runtime integration with NVIDIA TensorRT in preview
We are excited to open source the preview of the NVIDIA TensorRT execution provider in ONNX Runtime. With this release, we are taking another step towards open and interoperable AI by enabling developers to easily leverage industry-leading GPU acceleration regardless of their choice of framework. Developers can now tap into the power of TensorRT through ONNX Runtime to accelerate inferencing of ONNX models, which can be exported or converted from PyTorch, TensorFlow, MXNet and many other popular frameworks. Today, ONNX Runtime powers core scenarios that serve billions of users in Bing, Office, and more.
With the TensorRT execution provider, ONNX Runtime delivers better inferencing performance on the same hardware compared to generic GPU acceleration. We have seen up to 2X improved performance using the TensorRT execution provider on internal workloads from Bing MultiMedia services.
Accelerating machine learning for all
Our collaboration with NVIDIA marks another milestone in our venture to help developers and data scientists deliver innovation faster. We are committed to accelerating the productivity of all machine learning practitioners regardless of their choice of framework, tool, and application. We hope these new integrations make it easier to drive AI innovation and strongly encourage the community to try it out. Looking forward to your feedback!