Data scientists have a dynamic role. They need environments that are fast and flexible while upholding their organization’s security and compliance policies.
Data scientists working on machine learning projects need a flexible environment in which to run experiments, train and iterate on models, and innovate. They want to focus on building, training, and deploying models without getting bogged down in prepping virtual machines (VMs), laboriously entering parameters, and constantly going back to IT to make changes to their environments. Moreover, they need to stay within the compliance and security policies set by their organizations.
Organizations seek to empower their data scientists to do their job effectively, while keeping their work environment secure. Enterprise IT pros want to lock down security and have a centralized authentication system. Meanwhile, data scientists are more focused on having direct access to VMs so they can tinker with lower-level components such as CUDA drivers and special versions of the latest machine learning frameworks. However, direct access to the VM makes it hard for IT pros to enforce security policies. Azure Machine Learning service is developing features that allow data scientists to get the most out of their data and spend time focusing on their business objectives while maintaining their organizations’ security and compliance posture.
Azure Machine Learning service’s Notebook Virtual Machine (VM), announced in May 2019, resolves these conflicting requirements while simplifying the overall experience for data scientists. Notebook VM is a cloud-based workstation created specifically for data scientists. Notebook VM-based authoring is directly integrated into Azure Machine Learning service, providing a code-first experience for Python developers to conveniently build and deploy models in the workspace. Developers and data scientists can perform every operation supported by the Azure Machine Learning Python SDK using a familiar Jupyter notebook in a secure, enterprise-ready environment. Notebook VM is secure and easy to use, preconfigured for machine learning, and fully customizable.
Let’s take a look at how Azure Machine Learning service Notebook VMs are:
- Secure and easy-to-use
- Preconfigured for machine learning
- Fully customizable
1. Secure and easy to use
When a data scientist creates a notebook in a standard infrastructure-as-a-service (IaaS) VM, the setup requires many intricate, IT-specific parameters. They need to name the VM and specify image names, security parameters (virtual network, subnet, and more), storage accounts, and a variety of other IT-specific parameters. If incorrect parameters are given, or details are overlooked, the organization can be exposed to serious security risks.
Compared to an IaaS VM, the Notebook VM creation experience has been streamlined, as it takes just two parameters – a VM name and a VM type. Once the Notebook VM is created it provides access to Jupyter and JupyterLab – two popular notebook environments for data science. The access to the notebooks is secured out-of-the-box with HTTPS and Azure Active Directory, which makes it possible for IT pros to enforce a single sign-on environment with strong security features like Multi-Factor Authentication, ensuring a secure environment in compliance with organizational policies.
2. Preconfigured for machine learning
Setting up GPU drivers and deploying libraries on a traditional IaaS VM can be cumbersome and require substantial amounts of time. It can also get complicated finding the right drivers for given hardware, libraries, and frameworks. For instance, the latest versions of PyTorch may not work with the drivers a data scientist is currently using. Installation of client libraries for services such as Azure Machine Learning Python SDK can also be time-consuming, and some Python packages can be incompatible with others, depending on the environment where they are installed.
Notebook VM has the most up-to-date, compatible packages preconfigured and ready to use. This way, data scientists can use any of the latest frameworks on Notebook VM without versioning issues and with access to all the latest functionality of Azure Machine Learning service. Inside the VM, along with Jupyter and JupyterLab, data scientists will find a fully prepared environment for machine learning. Notebook VM draws its pedigree from Data Science Virtual Machine (DSVM), a popular IaaS VM offering on Azure. Similar to DSVM, it comes equipped with preconfigured GPU drivers and a selection of machine learning and deep learning frameworks.
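To make the versioning point concrete, here is a minimal sketch of how a data scientist might check which framework versions an environment actually ships with. It uses only the Python standard library. The function name and the package list are illustrative assumptions, not part of Notebook VM or the Azure Machine Learning SDK.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


# Illustrative list only; substitute the packages your project depends on.
FRAMEWORKS = ["torch", "tensorflow", "scikit-learn", "azureml-core"]

if __name__ == "__main__":
    for package, version in installed_versions(FRAMEWORKS).items():
        print(f"{package}: {version or 'not installed'}")
```

Running this in a notebook cell gives a quick inventory to compare against a project's requirements before training starts.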
Notebook VM is also integrated with its parent, the Azure Machine Learning workspace. The notebooks that data scientists run on the VM have access to the data stores and compute resources of the workspace. The notebooks themselves are stored in a Blob Storage account of the workspace. This makes it easy to share notebooks between VMs and keeps them safely preserved when the VM is deleted.
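As a rough illustration of this integration, the sketch below lists the datastores and compute targets attached to a workspace. It assumes the Azure Machine Learning Python SDK (the azureml-core package) is installed and that a workspace configuration file can be found by Workspace.from_config(). The helper names are hypothetical.

```python
def summarize_workspace(ws):
    """Return the names of the datastores and compute targets of a workspace.

    `ws` is expected to expose `name`, `datastores`, and `compute_targets`,
    as azureml.core.Workspace does.
    """
    return {
        "workspace": ws.name,
        "datastores": sorted(ws.datastores.keys()),
        "compute_targets": sorted(ws.compute_targets.keys()),
    }


def summarize_current_workspace():
    """Load the workspace from a local config file and summarize it.

    Assumption: azureml-core is installed and a config.json is discoverable.
    """
    from azureml.core import Workspace  # imported lazily; needs the SDK

    return summarize_workspace(Workspace.from_config())
```

Calling summarize_current_workspace() from a notebook cell would return a small dictionary of resource names. The summary logic is kept separate from the SDK call so it can be exercised with a stand-in object when the SDK is not available.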
3. Fully customizable
In environments where IT pros prepare virtual machines for data scientists, the preparation follows a rigorous process, and there are limitations on what can be done on these machines. In contrast, data scientists work dynamically and need the ability to customize VMs to fit their ever-changing needs. This often means going back to IT pros to have them make the necessary changes to the VMs. Even then, data scientists hit blockers when iterations don’t meet their needs or take too long. Some data scientists will resort to using their personal laptop to run jobs their corporate VMs don’t support, breaking compliance policies and putting the organization at risk.
While Notebook VM is a managed VM offering, it retains full access to hardware capabilities. Data scientists can create a Notebook VM of any VM type supported by Azure. They can then customize it to their heart’s desire by adding custom packages and drivers. For example, data scientists can quickly create a VM powered by the latest NVIDIA V100 GPU to perform step-by-step debugging of novel neural network architectures.
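One quick way to confirm that a GPU-backed VM exposes its hardware is to query the NVIDIA driver from Python. The sketch below assumes the nvidia-smi command-line tool is on the PATH, which is typical when NVIDIA drivers are installed. It returns an empty list otherwise. The function names are illustrative.

```python
import shutil
import subprocess


def parse_gpu_lines(text):
    """Split nvidia-smi CSV output into one non-empty, stripped line per GPU."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def list_gpus():
    """Return a 'name, total memory' string per visible GPU, or [] if none."""
    if shutil.which("nvidia-smi") is None:
        # No NVIDIA driver tooling found on this machine.
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_lines(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    print(gpus if gpus else "No NVIDIA GPU detected")
```

On a CPU-only VM the script reports that no GPU was detected. On a GPU-backed VM it should print one entry per device.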
If you are working with code, Notebook VM will offer you a smooth experience. It includes a set of tutorials and samples that put every capability of the Azure Machine Learning service just one click away. Give it a try and let us know your feedback.