Azure HDInsight previewing HBase clusters as a NoSQL database on Azure Blobs

On June 3, Microsoft announced an update to HDInsight that supports Hadoop 2.4, enabling 100x faster queries. Today, we are announcing the preview of Apache HBase clusters inside HDInsight.

HBase is a low-latency NoSQL database that allows online transactional processing (OLTP) of big data. In HDInsight, HBase is offered as a managed cluster integrated into the Azure environment. The clusters are configured to store data directly in Azure Blob storage, which provides low latency and the flexibility to balance performance against cost. This enables customers to build interactive websites that work with large datasets, to build services that store sensor and telemetry data from millions of endpoints, and to analyze this data with Hadoop jobs.

 

How to Create an HBase Cluster

To try HBase during the preview, use Azure PowerShell:

1. Install Windows Azure PowerShell

2. Set up your environment

3. Capture cluster credentials in a variable

PS C:\> $creds = Get-Credential

4. Create the HBase cluster:

PS C:\> New-AzureHDInsightCluster -Name yourclustername -ClusterType HBase -Version 3.0 -Location "West US" `
    -DefaultStorageAccountName yourstorageaccount.blob.core.windows.net -DefaultStorageAccountKey "yourstorageaccountkey" `
    -DefaultStorageContainerName hbasecontainername -Credential $creds -ClusterSizeInNodes 4

 

Manipulating Data in an HBase Cluster

Application developers can access HBase data through REST APIs, the HBase shell, and MapReduce-based jobs such as Hive and Pig. The HBase shell provides an interactive console to manage the HBase cluster, create and drop tables, and manipulate the data in them.
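As a rough illustration of the REST route, the Python sketch below builds (but does not send) a request for a single cell. The `/hbaserest/` gateway path, the Basic authentication scheme, and the table, row, column, user, and password values are assumptions made for this example; a stock HBase REST server addresses a cell as /<table>/<row>/<column>, and the endpoint exposed by a preview cluster may differ.

```python
import base64
from urllib.parse import quote
from urllib.request import Request


def cell_request(cluster, table, row, column, user, password):
    """Build (without sending) a GET request for one cell via the HBase REST API."""
    # Assumption: the cluster exposes the REST server under /hbaserest/.
    url = "https://{}.azurehdinsight.net/hbaserest/{}/{}/{}".format(
        cluster,
        quote(table, safe=""),
        quote(row, safe=""),
        quote(column, safe=":"),  # keep the family:qualifier separator readable
    )
    # Assumption: the cluster credentials are passed as HTTP Basic authentication.
    token = base64.b64encode("{}:{}".format(user, password).encode("utf-8"))
    return Request(url, headers={
        "Accept": "application/json",  # ask for JSON instead of the default format
        "Authorization": "Basic " + token.decode("ascii"),
    })


# Placeholder values only; nothing is sent over the network here.
req = cell_request("yourclustername", "sampletable", "row1", "cf1:col1",
                   "admin", "password")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually issue the call; it is left out here so the sketch runs without a cluster.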

1. To open the HBase shell, first enable an RDP connection to the cluster and connect to it:

After the cluster is created, it will appear in the Azure Portal under the HDInsight service.

Open the CONFIGURATION tab of the cluster.

Click on the ENABLE REMOTE button at the bottom of the page to enable the RDP connection to the cluster.

Click on the CONNECT button at the bottom of the CONFIGURATION tab.

 

2. Open the HBase Shell

Within your RDP session, click on the Hadoop command prompt shortcut located on the desktop.

Open the HBase shell:

cd %HBASE_HOME%\bin

hbase shell

 

3. Create a sample table, add a row to the table and list the rows in the table:

create 'sampletable', 'cf1'

put 'sampletable', 'row1', 'cf1:col1', 'value1'

scan 'sampletable'
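The same write and read can also be expressed through the REST API mentioned earlier. The HBase REST server's JSON representation base64-encodes row keys, column names, and values, which is easy to overlook. The Python sketch below only builds and decodes such payloads, with no network calls; the helper names `b64`, `put_body`, and `decode_rows` are hypothetical and exist only for this example.

```python
import base64
import json


def b64(text):
    """Base64-encode a string the way the HBase REST JSON format expects."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def put_body(row, column, value):
    """JSON body that writes one cell (the counterpart of the shell's put)."""
    return json.dumps({
        "Row": [{
            "key": b64(row),
            "Cell": [{"column": b64(column), "$": b64(value)}],
        }]
    })


def decode_rows(body):
    """Decode a JSON cell set into {row: {column: value}} using plain strings."""
    result = {}
    for row in json.loads(body)["Row"]:
        key = base64.b64decode(row["key"]).decode("utf-8")
        cells = result.setdefault(key, {})
        for cell in row["Cell"]:
            column = base64.b64decode(cell["column"]).decode("utf-8")
            cells[column] = base64.b64decode(cell["$"]).decode("utf-8")
    return result


# Mirror the shell example: one row, one column family, one value.
body = put_body("row1", "cf1:col1", "value1")
rows = decode_rows(body)
print(rows)
```

A body like this would be sent with a PUT request and a Content-Type of application/json, while a read returns the same structure with an additional timestamp on each cell, which `decode_rows` ignores.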

 

For more information on HBase and HDInsight, read: