Programmatically deploy a topology to Apache Storm on HDInsight
An example Java application that programmatically submits an Apache Storm topology to a Storm on HDInsight cluster. It is based on the example at http://nishutayaltech.blogspot.in/2014/06/submitting-topology-to-remote-storm.html.
Prerequisites
Java: The programming language that this example is written in.
Maven: A build management system for Java projects. Used to build the example code.
cURL: A cross-platform utility for working with REST APIs. Used to talk to the Ambari REST API on HDInsight.
jq: A cross-platform utility for working with JSON documents. Used to parse the data returned from Ambari through cURL.
A Java-based Storm topology that you want to deploy. If you don't have one, consider using a basic wordcount topology.
Build and package
If you are using an HDInsight 3.2 or older cluster version, edit the pom.xml file and change the version for the storm-core dependency to 0.9.3.
<dependency>
  <groupId>org.apache.storm</groupId>
  <artifactId>storm-core</artifactId>
  <!-- change to the version of Storm that you are using -->
  <version>0.10.0</version>
</dependency>
The project defaults to 0.10.0, which is the version of Storm provided with HDInsight 3.3 and 3.4. Older cluster versions used Storm 0.9.3.
From the command line, change directories to the project directory and use Maven to build and package the project (for example, mvn clean package).
This will create a file named SubmitToNimbus-0.0.1-SNAPSHOT.jar in the target directory.
Create an SSH tunnel and deploy a topology
Storm topologies are deployed by submitting the topology package to the Nimbus service, which runs on the HDInsight head nodes. This service listens on port 6627; however, this port is not exposed publicly on the internet. In order to communicate with Nimbus from outside the cluster, you can forward port 6627 on your development environment to port 6627 on the cluster using an SSH tunnel.
Use the following steps to forward port 6627 in your development environment to the cluster:
Use the following to find the internal fully qualified domain name (FQDN) of your head nodes:
curl -u admin:PASSWORD -G "https://CLUSTERNAME.azurehdinsight.net/api/v1/clusters/CLUSTERNAME/services/HDFS/components/NAMENODE" | jq '.host_components[].HostRoles.host_name'
You should receive a response similar to the following:
Save these values.
If you are using HDInsight 3.3 or 3.4, use the following command to forward your local port 6627 to hn1:
ssh -p 23 -C2qTnNf -L 6627:HN1-FQDN:6627 USERNAME@CLUSTERNAME-ssh.azurehdinsight.net
- Replace HN1-FQDN with the value from the previous step that begins with hn1.
- Replace USERNAME with the SSH user name for your HDInsight cluster.
- Replace CLUSTERNAME with the name of your HDInsight cluster.
If you secured your SSH account using a password, you will be prompted to enter it. If you used a certificate, you may need to use the -i parameter to specify the location of the private key.
[AZURE.NOTE] If you are using HDInsight 3.2, use the following command instead:
ssh -p 22 -C2qTnNf -L 6627:HN0-FQDN:6627 USERNAME@CLUSTERNAME-ssh.azurehdinsight.net
Replace HN0-FQDN with the value from the previous step that begins with hn0.
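The two tunnel variants above differ only in the SSH port and in which head node receives the forwarded traffic. The following is a minimal sketch of that choice in Java, not part of the sample project. The class TunnelCommand, its methods, and the host names in main are hypothetical, and it assumes the HDInsight 3.2 case forwards to the host whose name begins with hn0, matching the HN0-FQDN placeholder.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of the sample project) that assembles the ssh tunnel arguments. */
final class TunnelCommand {
    static final int NIMBUS_PORT = 6627;

    /** Picks the head node to forward to: hn1 on HDInsight 3.3/3.4, hn0 on 3.2 (assumed). */
    static String pickHeadNode(List<String> hostNames, boolean isHdi32) {
        String prefix = isHdi32 ? "hn0" : "hn1";
        for (String host : hostNames) {
            if (host.startsWith(prefix)) {
                return host;
            }
        }
        throw new IllegalArgumentException("No host name begins with " + prefix);
    }

    /** Builds the argument list: ssh port 22 for HDInsight 3.2, port 23 for 3.3/3.4. */
    static List<String> build(boolean isHdi32, String headNodeFqdn, String user, String cluster) {
        String sshPort = isHdi32 ? "22" : "23";
        String forward = NIMBUS_PORT + ":" + headNodeFqdn + ":" + NIMBUS_PORT;
        return Arrays.asList("ssh", "-p", sshPort, "-C2qTnNf", "-L", forward,
                user + "@" + cluster + "-ssh.azurehdinsight.net");
    }

    public static void main(String[] args) {
        // Made-up host names standing in for the values returned by the Ambari query.
        List<String> hosts = Arrays.asList("hn0-example.internal", "hn1-example.internal");
        String headNode = pickHeadNode(hosts, false);
        System.out.println(String.join(" ", build(false, headNode, "USERNAME", "CLUSTERNAME")));
    }
}
```

Returning the arguments as a list rather than a single string means they could be passed to ProcessBuilder without shell quoting concerns, although running ssh by hand as shown above works just as well.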
Use the following to submit the topology using the example application:
java -jar SubmitToNimbus-0.0.1-SNAPSHOT.jar <storm-topology-jar-file> <friendly-name-for-topology> <nimbus-host>
In this case, because port forwarding connects localhost to the remote head node, use localhost as the Nimbus host. If you are running the code directly on the cluster (for example, on the head node) or on a machine connected to the same virtual network as HDInsight, use the actual host name instead.
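Before submitting, it can help to confirm that something is listening on the Nimbus port of the host you plan to pass. The following is a minimal sketch using only the Java standard library. The class NimbusPortCheck and its method are hypothetical and not part of the sample project. A successful connection only shows that the port accepts TCP connections (for example, that the SSH tunnel is up), not that Nimbus itself is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical pre-flight check (not part of the sample project). */
final class NimbusPortCheck {
    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to localhost, which is the right host when the SSH tunnel is in use.
        String host = args.length > 0 ? args[0] : "localhost";
        boolean ok = isReachable(host, 6627, 2000);
        System.out.println(host + ":6627 " + (ok ? "accepts connections" : "is not reachable"));
    }
}
```

If the check fails with localhost, the tunnel is probably not running. With an actual host name, the machine may not be on the same virtual network as the cluster.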
Verify the topology was submitted
Assuming you received no errors during submission, you can view the topology using the Storm web UI for your HDInsight cluster. You can view this by pointing your browser to https://CLUSTERNAME.azurehdinsight.net/stormui. Replace CLUSTERNAME with the name of your HDInsight cluster. You should see the friendly name listed as a running topology.