A basic example of a .NET-based Apache Storm topology and how to run it on HDInsight. The topology randomly emits a sentence, which is then split into words. Finally, the words are counted, and each word is emitted along with its current count.
One of the following versions of Visual Studio
Azure SDK 2.5.1 or later
HDInsight Tools for Visual Studio: See Get started using HDInsight Tools for Visual Studio to install and configure the HDInsight tools for Visual Studio.
SCP.NET package version
The SCP.NET package version that you use for this project depends on the version of Storm installed on your HDInsight cluster. Use the following table to determine what SCP.NET version you must use in your project:
| HDInsight version | Apache Storm version | SCP.NET version |
About the code
This example Storm topology is implemented using the following components:
Spout.cs - This component emits one of five sentences to the output stream when it is called by the topology.
Splitter.cs - This component reads each sentence emitted by the spout and splits it into individual words. Each word is then emitted to the output stream.
Counter.cs - This component reads the words emitted by the splitter component and keeps a count of how many times each word has occurred.
Program.cs - This defines the topology, which describes how the data flows between the components, how many instances of each component to create, and other configuration information for the topology.
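To make the data flow concrete, the following is a minimal sketch of the spout, splitter, and counter logic in plain C#. It deliberately uses no SCP.NET types, so it is not the code from this project: the class name `WordCountSketch`, its method names, and the sample sentences are all hypothetical and chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Plain C# stand-ins for the topology components. No SCP.NET types are used,
// and all names here are hypothetical, not taken from the project source.
public static class WordCountSketch
{
    // Hypothetical sample sentences; the real Spout.cs defines its own five.
    private static readonly string[] Sentences =
    {
        "the cow jumped over the moon",
        "an apple a day keeps the doctor away",
        "four score and seven years ago",
        "snow white and the seven dwarfs",
        "i am at two with nature"
    };

    // Spout: returns one of the sentences at random each time it is called.
    public static string NextSentence(Random random)
    {
        return Sentences[random.Next(Sentences.Length)];
    }

    // Splitter: breaks a sentence into individual words, skipping empty entries.
    public static string[] Split(string sentence)
    {
        return sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Counter: increments the running count for a word and returns the
    // (word, count) pair that would be emitted downstream.
    public static KeyValuePair<string, int> Count(IDictionary<string, int> counts, string word)
    {
        int current;
        counts.TryGetValue(word, out current); // current stays 0 for a new word
        counts[word] = current + 1;
        return new KeyValuePair<string, int>(word, current + 1);
    }

    public static void Main()
    {
        var random = new Random();
        var counts = new Dictionary<string, int>();

        // Simulate three calls to the spout and push each sentence through
        // the splitter and counter in turn.
        for (int i = 0; i < 3; i++)
        {
            foreach (string word in Split(NextSentence(random)))
            {
                KeyValuePair<string, int> result = Count(counts, word);
                Console.WriteLine(result.Key + ": " + result.Value);
            }
        }
    }
}
```

In the actual topology these three stages run as separate components that exchange tuples over streams, and Program.cs wires them together. The sketch collapses that into direct method calls in a single process, which is enough to show what each stage contributes.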
The following is a visual representation of the data flow between the components.
Deploy this sample to Azure
Once you have downloaded this project, open it using Visual Studio.
In Solution Explorer, right-click the WordCount project and then select Build. This should download any required dependencies and build the project.
In Solution Explorer, right-click the WordCount project and then select Submit to Storm on HDInsight. If prompted, authenticate to your Azure subscription.
In the Submit Topology dialog, select your cluster using the Storm Cluster dropdown list. Leave the other fields at the default values, and then select Submit.
Note: It may take a few seconds to populate the list of Storm Clusters.
Once the topology has been submitted, the Topology View opens and displays the topologies currently deployed to the Storm cluster. Select the WordCount entry to display information about the topology.
Double-click the Counter entry to display information for the Counter bolt.
From the Executors section, select the Port. This will display a log of information emitted by this bolt.
The log includes the emitted tuples, each of which contains a word and the number of times that word has occurred.
To stop the topology, return to the Topology View, select the topology, and then select the Kill button above the topology information.
For more information on working with C# topologies and Storm on HDInsight, see Develop C# topologies for Apache Storm on HDInsight using HDInsight tools for Visual Studio.