Apache Storm for HDInsight
Real-time stream processing made easy for big data
- Stream millions of events per second
- Real-time computation system
- Built on an industry-leading open-source platform
- Highly available and fault tolerant
- Cloud elasticity
- Integration with Visual Studio
- No hardware to buy
- Deploy in a few clicks
What is Apache Storm?
Apache Storm is a distributed, fault-tolerant, open-source, real-time event processing solution for large, fast streams of data. First made famous by Twitter, which used the technology on its massive tweet streams, Storm is now a project of the Apache Software Foundation. The Azure cloud makes Apache Storm easy and cost-effective to deploy, with no hardware to buy, no software to configure, your choice of development tools (Java or C#), and deep integration with Visual Studio.
Real-time processing for real-time challenges
Today’s connected world is defined by big data that arrives in real time. Storm is ideal for challenging real-time scenarios like fraud detection, click-stream analysis, financial alerts, telemetry from connected sensors and devices (IoT), social analytics, "always on" ETL pipelines, and network monitoring. Customers can source these real-time events from devices, sensors, infrastructure, applications, websites, and data.
Easy setup, fast results
With Storm for HDInsight, there’s no time-consuming installation or setup. Azure does it for you. You’ll be up and running in minutes, and can deploy Storm without buying new hardware or incurring other up-front costs.
Integrated development environment for easier and faster results
Storm is simple to use and supports any programming language, including Java and .NET. Built-in integration with the Visual Studio IDE means that you can develop, deploy, and debug Storm topologies quickly and easily. You can even mix spouts and bolts written in different languages, so you can reuse the vast universe of existing spouts and bolts as part of your topology.
Elastic capacity for big data
Storm for HDInsight leverages the power of the Azure cloud, making it easier to create clusters of any size to process any amount of data on demand. We charge only for the compute and storage you actually use.
High availability for guaranteed business continuity
Storm is fault tolerant and automatically restarts workers on other nodes in case of failure. Storm for HDInsight takes this a step further by guaranteeing 99.9% uptime for your Storm clusters. Azure also offers 24x7 enterprise support and cluster monitoring.
Deploy your first Apache Storm analytics pipeline
Deploying an Apache Storm cluster and running your first real-time analytics pipeline can be done in minutes.
Use your Azure subscription or create a trial account to log on to the Azure portal.
Name the Storm cluster and pick the number of nodes to define its size. You can deploy a Storm cluster with anywhere from 1 node to hundreds of nodes, and you can scale a running Storm cluster up or down.
Deploying a Storm cluster usually takes about 15 minutes. Once it is deployed, click STORM DASHBOARD at the bottom of the page to deploy your first Storm topology.
Provide the username and password that you chose when creating the cluster.
From the drop-down list, either pick one of the sample topologies or upload a new topology, which should be compiled as a JAR file.
Click Submit to deploy the WordCount topology. This topology counts the words in a stream of sentences that arrive as input.
Once the submission is completed, you can click Storm UI to monitor the running topology.
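To make the word-counting idea from the steps above concrete, here is a minimal plain-Java sketch of the logic such a topology performs. It is an illustration only and does not use the Storm API or reproduce the sample shipped with HDInsight. The class name WordCountSketch and the methods splitSentence and countWords are hypothetical names chosen for this example. In a real topology, a spout would emit the sentences and separate bolts would do the splitting and counting across the cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of word-count logic. It does NOT use the Storm API;
// all names here are hypothetical and exist only for illustration.
final class WordCountSketch {

    // Plays the role of a "split" bolt: one sentence in, zero or more words out.
    static List<String> splitSentence(String sentence) {
        List<String> words = new ArrayList<>();
        // Lowercase the sentence and split on runs of non-word characters.
        for (String token : sentence.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    // Plays the role of a "count" bolt: keeps a running total per word.
    static Map<String, Integer> countWords(Iterable<String> sentences) {
        // TreeMap keeps the words sorted so the printed result is stable.
        Map<String, Integer> counts = new TreeMap<>();
        for (String sentence : sentences) {
            for (String word : splitSentence(sentence)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // This fixed list stands in for a spout that emits sentences.
        List<String> sentences = Arrays.asList(
            "the cow jumped over the moon",
            "the moon is bright");
        System.out.println(countWords(sentences));
    }
}
```

Unlike this sketch, which processes a fixed list and then stops, a running topology keeps updating its counts for as long as sentences keep arriving.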
It's easy to build, deploy, and manage Storm topologies, all from within the Visual Studio environment. The Azure SDK also ships with easy-to-get-started templates for Storm on HDInsight. The Visual Studio integrated experience increases productivity and lets you do full project management without leaving the IDE.