Apache Storm for HDInsight
Real-time stream processing made easy for big data
- Stream millions of events per second
- Real-time computation system
- Built on industry-leading open-source platform
- Highly available and fault tolerant
- Cloud elasticity
- Integration with Visual Studio
- No hardware to buy
- Deploy in a few clicks
What is Apache Storm?
Apache Storm is a distributed, fault-tolerant, open-source, real-time event processing solution for large, fast streams of data. First made famous by Twitter, which used the technology on its massive tweet streams, Storm is a project of The Apache Software Foundation. Azure makes Apache Storm easy and cost-effective to deploy, with no hardware to buy, no software to configure, your choice of development tools (Java or C#) and deep integration with Visual Studio.
Real-time processing for real-time challenges
Today’s connected world is defined by big data that arrives in real-time. Storm is ideal for challenging real-time scenarios like fraud detection, clickstream analysis, financial alerts, telemetry from Internet of Things (IoT) sensors and devices, social analytics, always-on ETL pipelines and network monitoring. Your customers can source these real-time events from devices, sensors, infrastructure, applications, websites and data.
Easy setup, fast results
There is no time-consuming installation or setup with Storm for HDInsight. Azure does it for you. Get up and running in minutes and deploy Storm without buying new hardware or paying other up-front costs.
Integrated development environment for easier and faster results
Storm is simple to use and supports any programming language, including Java and .NET. Built-in integration with the Visual Studio IDE means that you can develop, deploy and debug Storm topologies quickly and easily. You can mix spouts and bolts written in different languages within a single topology, which means that you can take advantage of the universe of existing spouts and bolts.
Elastic capacity for big data
Storm for HDInsight takes advantage of the power of Azure, which makes it easier for you to create clusters of any size, to process any amount of data on demand. We charge only for the compute and storage that you actually use.
High availability for business continuity
Storm is fault-tolerant and automatically restarts workers on other nodes in case of failure. Storm for HDInsight takes this a step further with 99.9% uptime for your Storm clusters. Azure also provides 24x7 enterprise support and cluster monitoring.
Deploy your first Apache Storm analytics pipeline
You can deploy an Apache Storm cluster and run your first real-time analytics pipeline in minutes.
1. Use your Azure subscription or create a trial account to log on to the Azure portal.
2. Give a name to the Storm cluster and pick the number of nodes to define the size of the cluster. You can deploy a Storm cluster from 1 node all the way to hundreds of nodes. We also allow you to scale up or scale down a running Storm cluster.
3. It usually takes 15 minutes to deploy a Storm cluster. Once it is deployed, click STORM DASHBOARD at the bottom of the page to deploy your first Storm topology.
4. Provide the username and password which you chose when creating the cluster.
5. From the drop-down list, either pick one of the sample topologies or upload a new topology, which should be compiled as a JAR file.
6. Click Submit to deploy the WordCount topology. This topology counts the occurrences of each word in a stream of sentences that arrives as input.
7. Once the submission is completed, you can click Storm UI to monitor the running topology.
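To make the sample topology in step 6 more concrete, here is a minimal sketch of the word-counting logic in plain Java. It deliberately uses no Storm APIs: the class name WordCountSketch and the methods splitSentence and countWords are hypothetical stand-ins for what a sentence-splitting bolt and a counting bolt might do, and the fixed list of sentences stands in for a spout. The actual sample shipped with the cluster may be structured differently.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: plain Java, not the Storm API and not the shipped sample.
public class WordCountSketch {

    // Stand-in for a "split" bolt: breaks one sentence into lowercase words.
    // Punctuation is not stripped in this sketch.
    static List<String> splitSentence(String sentence) {
        String trimmed = sentence.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return List.of();
        }
        return List.of(trimmed.split("\\s+"));
    }

    // Stand-in for a "count" bolt: keeps a running tally for each word seen.
    static Map<String, Integer> countWords(List<String> sentences) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String sentence : sentences) {
            for (String word : splitSentence(sentence)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in for a spout: a fixed batch instead of an unbounded stream.
        List<String> sentences = List.of(
                "the cow jumped over the moon",
                "the moon is bright");
        System.out.println(countWords(sentences));
        // prints {the=3, cow=1, jumped=1, over=1, moon=2, is=1, bright=1}
    }
}
```

In a real topology the input never ends, so the counts are running totals that keep updating as new sentences arrive, and the splitting and counting stages run in parallel across the worker nodes of the cluster.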
It is easy to build, deploy and manage Storm topologies entirely from within Visual Studio. The Azure SDK also ships with easy-to-get-started templates for Storm on HDInsight. This integrated experience increases productivity and lets you handle full project management without leaving the IDE.