Microsoft brings the familiarity of R to the scalability of Hadoop and Spark in the Cloud

Published on March 29, 2016

Product Marketing, Hadoop/Big Data and Data Warehousing

Today at Strata + Hadoop World, we announced the availability of R Server inside Azure HDInsight, our managed Hadoop-as-a-service offering that is part of Azure Data Lake. This gives Azure HDInsight the most comprehensive set of machine learning (ML) algorithms and statistical functions in the cloud that also leverages Hadoop and Spark.

By making R Server available as a workload inside HDInsight, we remove the obstacles that keep users from unlocking the power of R: memory and processing constraints are eliminated, and analytics extend from the laptop to large multi-node Hadoop and Spark clusters. Users can train and run ML models on larger datasets than previously possible and make more accurate predictions that affect the business.

What is R Server on HDInsight?

R is one of the most popular programming languages, helping millions of data scientists solve their most challenging problems in fields ranging from computational biology to quantitative marketing. R Server for Azure HDInsight is a scale-out implementation of R integrated with Hadoop and Spark clusters created from HDInsight. It is the only 100% open source R implementation that runs in the cloud on Hadoop and Spark. This gives you the familiarity of the R language for machine learning while leveraging the scalability and reliability built into Hadoop and Spark. It also eliminates memory and processing constraints, so you can easily extend your code from your laptop to large multi-terabyte files and produce models that are more powerful and accurate.


R Server for HDInsight is built on open standards

Microsoft R is 100% compatible with open source R, and any existing R library can be used in the R Server context. Additionally, R Server can leverage the power of Hadoop to parallelize any existing R function across multiple nodes, letting you reuse your existing knowledge and code investments. This makes it simple to run parameter sweeps or to simulate models with different initial conditions. R Server on HDInsight also allows you to use your preferred open source IDE (e.g. RStudio).
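
To make the idea of a sweep over initial conditions concrete, here is a minimal sketch in base R. The model function simulate_growth and its parameters are hypothetical and chosen only for illustration; the sweep itself runs locally and sequentially. The commented-out line shows how the same function might be handed to the RevoScaleR function rxExec on a cluster, assuming a Hadoop or Spark compute context has already been set up.

```r
# Hypothetical model: iterate a simple logistic growth step from a start value.
simulate_growth <- function(x0, rate = 0.5, steps = 20) {
  x <- x0
  for (i in seq_len(steps)) {
    x <- x + rate * x * (1 - x)
  }
  x
}

# Initial conditions to sweep over (illustrative values).
initial_values <- c(0.01, 0.1, 0.5, 0.9)

# Local, sequential sweep in base R: one model run per initial value.
results <- vapply(initial_values, simulate_growth, numeric(1))

# On an R Server cluster, the same unchanged function could be distributed,
# one task per initial value (assumption: a compute context is already set):
#   results <- rxExec(simulate_growth, x0 = rxElemArg(initial_values))

print(data.frame(x0 = initial_values, final = results))
```

The point of the sketch is that the model function stays ordinary R code; only the call that drives the sweep would change when moving from a laptop to a cluster.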


How can I get started?

To get started, you will need an Azure subscription or an Azure free trial. With this in hand, you should be able to get an R Server for HDInsight cluster up and running in minutes.

Check out the links below for more information.