Apache Storm for HDInsight

Real-time stream processing made easy for big data

What is Apache Storm?

Apache Storm is a distributed, fault-tolerant, open-source, real-time event processing solution for large, fast streams of data. First made famous by Twitter, which used the technology on its massive tweet streams, Storm is a project of The Apache Software Foundation. Azure makes Apache Storm easy and cost-effective to deploy, with no hardware to buy, no software to configure, your choice of development tools (Java or C#) and deep integration with Visual Studio.


Data comes in from various sources (applications, devices, sensors, web, social) and is collected in the cloud via web APIs or field gateways. The data is put into a queueing service like Event Hubs, Kafka, RabbitMQ or ActiveMQ, for real-time data processing with Apache Storm on HDInsight. The data moves to long-term storage with Apache HBase on HDInsight, where you can run your real-time dashboards, queries and analytics.

Real-time processing for real-time challenges

Today’s connected world is defined by big data that arrives in real time. Storm is ideal for challenging real-time scenarios, such as fraud detection, clickstream analysis, financial alerts, telemetry from Internet of Things (IoT) sensors and devices, social analytics, always-on ETL pipelines and network monitoring. Your customers can source these real-time events from devices, sensors, infrastructure, applications, websites and other data sources.

Easy setup, fast results

There’s no time-consuming installation or setup with Storm for HDInsight. Azure does it for you. Get up and running in minutes, and deploy Storm without buying new hardware or paying other up-front costs.

Integrated development environment for easier and faster results

Storm is simple to use and supports any programming language, including Java and .NET languages such as C#. Built-in integration with the Visual Studio IDE means that you can develop, deploy and debug Storm topologies quickly and easily. You can mix spouts and bolts written in different languages within one topology, which means that you can take advantage of the universe of existing spouts and bolts.

Elastic capacity for big data

Storm for HDInsight takes advantage of the power of Azure, which makes it easier for you to create clusters of any size to process any amount of data on demand. We only charge for the compute and storage that you actually use.

High availability for business continuity

Storm is fault-tolerant and automatically restarts workers on other nodes in case of failure. Storm for HDInsight takes this a step further with 99.9% uptime for your Storm clusters. Azure also provides 24/7 enterprise support and cluster monitoring.

Deploy your first Apache Storm analytics pipeline

Deploying an Apache Storm cluster and running your first real-time analytics pipeline can be done in minutes.

Use your Azure subscription or create a trial account to log in to the Azure portal.

Give the Storm cluster a name and pick the number of nodes to define the size of the cluster. You can deploy a Storm cluster from 1 node all the way to hundreds of nodes. You can also scale a running Storm cluster up or down.

It usually takes about 15 minutes to deploy a Storm cluster. Once it has been deployed, click STORM DASHBOARD at the bottom of the page to deploy your first Storm topology.

Provide the username and password that you chose when creating the cluster.

From the drop-down list, you can either pick one of the sample topologies or upload a new topology, which should be compiled as a JAR file.

Click Submit to deploy the WordCount topology. This topology counts the occurrences of each word in a stream of sentences that arrive as input.

Once the submission is complete, you can click Storm UI to monitor the running topology.
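To give a feel for what the sample word-count topology computes, here is a minimal plain-Java sketch of the same logic. It does not use the Storm API and is not the sample's actual source code: the class name `WordCountSketch` and the methods `splitSentence` and `countWords` are hypothetical names chosen for illustration. In a real topology, a spout would emit the sentences and separate bolts would do the splitting and counting in parallel across the cluster.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Plain-Java sketch of word-count logic. Illustrative only: no Storm classes are used.
final class WordCountSketch {

    // Plays the role of a "split" step: break one sentence into lowercase words.
    static List<String> splitSentence(String sentence) {
        List<String> words = new ArrayList<>();
        for (String token : sentence.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) { // skip the empty token produced by leading whitespace
                words.add(token);
            }
        }
        return words;
    }

    // Plays the role of a "count" step: keep a running total for each word seen.
    static Map<String, Integer> countWords(List<String> sentences) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String sentence : sentences) {
            for (String word : splitSentence(sentence)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical input standing in for the sentence stream.
        List<String> sentences = List.of(
            "the cow jumped over the moon",
            "the moon is bright");
        System.out.println(countWords(sentences));
    }
}
```

The sketch processes a fixed list, whereas a Storm topology runs continuously and updates the counts as each new sentence arrives.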

It’s easy to build, deploy and manage Storm topologies, all from within Visual Studio. The Azure SDK also ships with easy-to-get-started templates for Storm on HDInsight. This integrated experience increases productivity and lets you handle full project management without leaving the IDE.

Try HDInsight for free