Get started in the Hadoop ecosystem with a Hadoop sandbox on a virtual machine
You can set up a Hadoop sandbox from Hortonworks on a virtual machine to learn about the Hadoop ecosystem. The sandbox provides a local development environment to learn about Hadoop, Hadoop Distributed File System (HDFS), and job submission.
To get started with the Hortonworks Hadoop Sandbox, see Hortonworks Sandbox and read the section Hortonworks Sandbox on a VM. We recommend that you run through the tutorials available with the sandbox to understand HDFS, how jobs are submitted to a cluster, and how to track jobs running on a cluster.
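To give a rough idea of what a job does before you open the sandbox tutorials, the following is a minimal sketch of the map, shuffle, and reduce steps behind a word count, written as plain Python that runs locally. It is not taken from the Hortonworks tutorials and does not use any Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word, as a word-count mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group values by key, mimicking the shuffle between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts for each word, as a word-count reducer would."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read its input from HDFS.
    sample = ["hadoop stores data in HDFS", "hadoop runs jobs on data"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

On a real cluster, the framework distributes the map and reduce steps across nodes and handles the shuffle itself; this sketch only shows the data flow on a single machine.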
Once you are familiar with Hadoop, you can start using Hadoop on Azure by creating an HDInsight cluster. For more information on how to get started, see Get started with Hadoop on HDInsight.