Using the Azure portal, you can provision Hadoop clusters in Azure HDInsight, change the Hadoop user password, and enable Remote Desktop Protocol (RDP) to access the Hadoop command console on the cluster.
In addition to the Azure portal, other tools are available for administering HDInsight.
For more information on administering HDInsight by using Azure PowerShell, see Administer HDInsight Using Azure PowerShell.
For more information on administering HDInsight by using the Azure CLI, see Administer HDInsight Using Azure CLI.
Before you begin this article, you must have the following:
You can provision HDInsight clusters from the Azure portal by using the Quick Create or Custom Create option. See the following links for instructions:
The storage account must be located in the same datacenter as the HDInsight cluster. For available datacenters, see the Regions menu on the HDInsight Pricing page.
HDInsight works with a wide range of Hadoop components. For the list of the components that have been verified and supported, see What version of Hadoop is in Azure HDInsight. You can customize HDInsight by using one of the following options:
Some native Java components, like Mahout and Cascading, can be run on the cluster as JAR files. These JAR files can be distributed to Azure Blob storage, and submitted to HDInsight clusters through Hadoop job submission mechanisms. For more information, see Submit Hadoop jobs programmatically.
If you have issues deploying JAR files to HDInsight clusters or calling JAR files on HDInsight clusters, contact Microsoft Support.
Cascading is not supported by HDInsight and is not eligible for Microsoft Support. For lists of supported components, see What's new in the cluster versions provided by HDInsight?
Installing custom software on the cluster by using Remote Desktop Connection is not supported. Avoid storing files on the drives of the head node, because they are lost if you need to re-create the cluster. We recommend storing files in Azure Blob storage, which is persistent.
An HDInsight cluster can have two user accounts. The HDInsight cluster user account is created during the provisioning process. You can also create an RDP user account for accessing the cluster via RDP. See Enable remote desktop.
To change the HDInsight cluster user name and password
The credentials that you provided when you created the cluster give access to the services on the cluster, but not to the cluster itself through Remote Desktop. Remote Desktop access is turned off by default, so connecting to the cluster this way requires additional configuration after the cluster is created.
To enable Remote Desktop
In the Configure Remote Desktop wizard, enter a user name and password for the remote desktop. The user name must be different from the one used to create the cluster (admin by default with the Quick Create option). Enter an expiration date in the EXPIRES ON box. The expiration date must be in the future and no more than 90 days from today. The expiration time defaults to midnight of the specified date. Then click the check icon.
You can also use the HDInsight .NET SDK to enable Remote Desktop on a cluster. Use the EnableRdp method on the HDInsight client object in the following manner: client.EnableRdp(clustername, location, "rdpuser", "rdppassword", DateTime.Now.AddDays(6)). Similarly, to disable Remote Desktop on the cluster, use client.DisableRdp(clustername, location). For more information on these methods, see HDInsight .NET SDK Reference. These methods apply only to HDInsight clusters running on Windows.
Once RDP is enabled for a cluster, you must refresh the page before you can connect to the cluster.
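The Remote Desktop rules described above (an RDP user name that differs from the cluster user name, and an expiration date in the future but no more than 90 days out) can be checked before you submit the settings. The following Python sketch is illustrative only: the function name validate_rdp_settings is hypothetical, it is not part of the portal or the HDInsight .NET SDK, and it assumes that user names are compared case-insensitively and that a date exactly 90 days from today is still accepted.

```python
from datetime import date, timedelta

# The article states the expiration date can be up to 90 days from today.
MAX_RDP_DAYS = 90


def validate_rdp_settings(cluster_user, rdp_user, expires_on, today=None):
    """Return a list of problems; an empty list means the settings look valid."""
    today = today or date.today()
    problems = []

    # Assumption: user names are compared case-insensitively.
    if rdp_user.casefold() == cluster_user.casefold():
        problems.append("RDP user name must differ from the cluster user name")

    if expires_on <= today:
        problems.append("expiration date must be in the future")
    # Assumption: exactly 90 days from today is still allowed.
    elif expires_on > today + timedelta(days=MAX_RDP_DAYS):
        problems.append("expiration date must be within 90 days of today")

    return problems


if __name__ == "__main__":
    # Mirrors the article's example of an expiration six days from now.
    expiry = date.today() + timedelta(days=6)
    print(validate_rdp_settings("admin", "rdpuser", expiry))
```

An empty list means none of the stated rules is violated. The service itself remains the authority on what it accepts.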
To connect to a cluster by using RDP
If you want to perform any operations on the cluster by using the .NET SDK, you must create a self-signed certificate on the workstation, and also upload the certificate to your Azure subscription. This is a one-time task. You can install the same certificate on other machines, as long as the certificate is valid.
To create a self-signed certificate
Create a self-signed certificate that is used to authenticate the requests. You can use Internet Information Services (IIS) or makecert to create the certificate.
Browse to the location of the certificate, right-click the certificate, click Install Certificate, and install the certificate to the computer's personal store. Edit the certificate properties to assign it a friendly name.
Import the certificate into the Azure portal. From the portal, click SETTINGS on the bottom left of the page, and then click MANAGEMENT CERTIFICATES. From the bottom of the page, click UPLOAD and follow the instructions to upload the .cer file you created in the previous step.
HDInsight clusters have the following HTTP web services (all of these services have RESTful endpoints):
By default, access to these services is granted. You can revoke or grant access from the Azure portal.
To grant/revoke HTTP web services access
This can also be done through the Azure PowerShell cmdlets:
To connect to the cluster by using Remote Desktop and use the Hadoop command line, you must first enable Remote Desktop access to the cluster as described in the previous section.
To open a Hadoop command line
From the desktop, double-click Hadoop Command Line.
For more information on Hadoop commands, see Hadoop commands reference.
In the previous screenshot, the folder name has the Hadoop version number embedded. The version number can change based on the version of the Hadoop components installed on the cluster. You can use Hadoop environment variables to refer to those folders. For example:
cd %hadoop_home%
cd %hive_home%
cd %pig_home%
cd %sqoop_home%
cd %hcatalog_home%
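Because the folder names change with the component versions, a script can read the same environment variables instead of hard-coding paths. The following Python sketch assumes a Python interpreter is available in the session where the variables are defined. The helper name component_homes is hypothetical, and the variable names are the ones from the example above. Whether each variable is set depends on the cluster.

```python
import os

# Variable names taken from the example above; whether each one is set
# depends on the components installed on the cluster.
COMPONENT_VARS = ("hadoop_home", "hive_home", "pig_home", "sqoop_home", "hcatalog_home")


def component_homes(env=None):
    """Map each variable name to its folder, or None if the variable is not set."""
    env = os.environ if env is None else env
    # Windows treats environment variable names case-insensitively,
    # so normalize the keys before looking them up.
    lookup = {key.lower(): value for key, value in env.items()}
    return {name: lookup.get(name) for name in COMPONENT_VARS}


if __name__ == "__main__":
    for name, path in component_homes().items():
        print("%{0}% -> {1}".format(name, path or "(not set)"))
```

Passing a dictionary as env makes the lookup easy to try without a cluster. With no argument, the function reads the environment of the current process.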
In this article, you have learned how to create an HDInsight cluster by using the Azure portal, and how to open the Hadoop command-line tool. To learn more, see the following articles: