R Server for HDInsight

Predictive analytics, machine learning and statistical modelling for big data using R

  • Largest portable R-parallel analytics and machine learning library
  • Terabyte-scale machine learning – 1,000 times larger than in open-source R
  • Deliver up to 50 times faster performance using R Server for Apache Spark 2.0 and optimised vector/maths libraries
  • Enterprise-grade security and support backed by a Microsoft SLA
  • Access Spark data sources through Spark SQL
  • Easy setup for fast results

What is R Server for HDInsight?

By combining enterprise-scale R analytics software with the power of Hadoop and Spark, R Server for HDInsight provides unprecedented scale and performance. Multi-threaded maths libraries and transparent parallelisation let R Server handle up to 1,000 times more data and run up to 50 times faster than open-source R, helping you train more accurate models for better predictions than previously possible. Plus, because R Server is built to work with the open-source R language, all of your R scripts run without changes.

Working with the power and familiarity of R

A top choice among data scientists, the R programming language has a thriving community of more than two million users worldwide, and the number of open-source analytics packages is growing exponentially year after year. With R Server for HDInsight, you get full compatibility with the R language running at scale on Hadoop and Spark.

R usage is on the rise. From 2007 to 2013, the share of data miners who report using R increased from 20% to 70%. From 2008 to 2013, the share of data miners who use R as their primary tool increased from less than 5% to 24%.
The number of CRAN packages released has increased significantly in the last few years. In 2005, there were very few. The number increased to 1,000 by 2012, to 3,000 by 2014 and to over 8,000 by 2016.

Largest portable, R-parallel analytics and machine learning library

Take advantage of the largest parallel analytics and machine learning library built to work with the open-source R language. It is portable across popular data platforms and includes decision trees and ensembles, regression models, clustering, data preparation, visualisation and statistical functions.

Terabyte-scale machine learning handles 1,000 times more data

With transparent parallelisation on top of Hadoop and Spark, R Server for HDInsight lets you handle terabytes of data, 1,000 times more than the open-source R language alone. Train logistic regression models, trees and ensembles on as much data as your Spark cluster can accommodate; the size of the cluster is the only limit.

Get up to 50 times faster performance

Combine Spark, multi-threaded vector and matrix maths libraries and R Server for HDInsight to experience up to 50 times faster performance than previously possible with open-source R.

Run distributed parameter sweeps and simulations with existing R functions

Run any open-source R function over hundreds of nodes for parallel parameter sweeps and simulations. Explore and refine your models for faster, easier and more accurate predictions.

Access Spark data sources through Spark SQL

Analysing data in Hadoop and Spark is now even easier using Spark SQL as a data source for R Server. Load the results of a Spark SQL query against sources such as Apache Hive and Parquet into a Spark DataFrame, and analyse them directly using any of R Server's distributed computing algorithms.

Use the development tools of your choice

R Server for HDInsight includes RStudio Server Community Edition, making it easy for data scientists to get started quickly. You can also download R Tools for Visual Studio for free for a convenient local development environment.

Enterprise-grade security and support

Rely on enterprise-grade security and support from Azure, including version packages, patching, security updates and continuous cluster monitoring. Plus, a Microsoft-backed Service Level Agreement (SLA) with 99.9% guaranteed connectivity helps protect your R Server for HDInsight clusters against catastrophic events.

Easy setup, fast results

With R Server for HDInsight, there's no time-consuming installation or setup, because Azure does it for you. You'll be up and running in minutes, ready to train your statistical and machine learning models without buying new hardware or incurring other up-front costs. Pay only for the compute power and storage that you use.

Apache Hadoop® and associated open-source project names are trademarks of the Apache Software Foundation.

Try R Server for HDInsight