
Getting Started with Windows Azure HDInsight Service

Editor’s Note: This post comes from Matt Winkler, Principal Program Manager at Microsoft.

This morning we made some big announcements about delivering Hadoop for Windows Azure users. Windows Azure HDInsight Service is the easiest way to deploy, manage and scale Hadoop-based solutions. This release includes:

  • Hadoop updates that bring the latest stable versions of:
    • HDFS and Map/Reduce
    • Pig
    • Hive
    • Sqoop
  • Increased availability of the preview service
  • A local, developer installation of Microsoft HDInsight Server
  • An SDK for writing Hadoop jobs using .NET and Visual Studio

Community Contributions 

As part of our ongoing commitment to Apache™ Hadoop®, the team has been actively submitting our changes to Apache™. You can track this work in branch-1-win, which holds the check-ins related to HDFS and Map/Reduce. We’re also contributing patches to other projects, including Hive, Pig and HBase. This set of components is just the beginning: with monthly refreshes ahead, we’ll be adding more projects, such as HCatalog.

Getting Access to the HDInsight Service

To get started, submit the invitation form. We are sending out invitation codes as capacity allows. Once in the preview, you can provision a cluster, for free, for 5 days. We’ve made it easy to use Windows Azure Blob storage, so you can store your data permanently in Blob storage and bring your Hadoop cluster online only when you need to process data. This way, you use only the compute you need, when you need it, and you get the benefits of Windows Azure storage, such as geo-replication and access to the data from any application.

Simplifying Development

Hadoop has been built to allow a rich developer ecosystem, and we’re taking advantage of that to make it easier to start writing Hadoop jobs in the languages you’re familiar with. In this release, you can use JavaScript to build Map/Reduce jobs, as well as compose Pig and Hive queries using the JavaScript console hosted on the cluster dashboard. The JavaScript console also lets you explore data and refine your jobs with a simple syntax, directly from a web browser.

For .NET developers, we’ve built an API on top of Hadoop streaming that allows for writing Map/Reduce jobs using .NET.  This is available in NuGet, and the code is hosted on CodePlex. Some of the features include:

  • Choice of loose or strong typing
  • In-memory debugging
  • Submission of jobs directly to a Hadoop cluster
  • Samples in C# and F#
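
To give a feel for what an API on top of Hadoop streaming has to wrap, here is a minimal word-count sketch in Python. It is not the .NET SDK and does not use its API. It only illustrates the general streaming convention, in which a mapper and a reducer exchange tab-separated key/value lines. The function names (map_lines, reduce_lines, run_locally) are hypothetical, and run_locally stands in for the cluster's shuffle-and-sort step so the job can be exercised in memory, in the spirit of the in-memory debugging feature listed above.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record per word in the input lines."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(sorted_records):
    """Reducer: sum the counts per key. Records must already be sorted by key."""
    parsed = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_locally(lines):
    """Simulate map -> sort -> reduce in memory, with no cluster involved."""
    return list(reduce_lines(sorted(map_lines(lines))))
```

Calling run_locally(["the cat", "The dog the"]) returns three records: cat with a count of 1, dog with 1, and the with 3. On a real cluster the mapper and reducer would typically run as separate processes that read standard input and write standard output, with Hadoop doing the sort between them.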

Get Started

  • Sign up for the Windows Azure HDInsight Service Preview.
  • Download the Microsoft HDInsight Server community technology preview.
  • Get started with the .NET SDK For Hadoop.

     – Matt Winkler, Principal Program Manager.