Performs an ETL job using Azure services

Last updated: 3/4/2020

This template provides an example of how to perform analytics on both historical and real-time streaming data stored in Azure Blob Storage. An Azure Stream Analytics job reads data from the event hub, performs a transformation, and writes the output to Azure Blob Storage, where it is visualized in Power BI. Azure Data Lake Analytics applies analytics to the historical data stored in Blob Storage, while Azure Data Factory orchestrates the movement of the extracted, transformed, and published data. The published data is then visualized in Power BI.
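For illustration, the transformation step of the Stream Analytics job can be pictured as a query that reads from the Event Hub input and writes windowed aggregates to the Blob Storage output. The sketch below is hypothetical (the actual query ships inside azuredeploy.json) and assumes the Az.StreamAnalytics PowerShell module (2.x); the job name, input/output aliases, and field names are placeholders.

# Hypothetical transformation query: aggregate events from the Event Hub input
# into 5-minute windows and route the results to the Blob Storage output.
$query = @"
SELECT deviceId,
       AVG(temperature) AS avgTemperature,
       System.Timestamp() AS windowEnd
INTO [blob-output]
FROM [eventhub-input] TIMESTAMP BY eventTime
GROUP BY deviceId, TumblingWindow(minute, 5)
"@

# Attach the query to an existing Stream Analytics job (names are placeholders)
New-AzStreamAnalyticsTransformation -ResourceGroupName <resource-group-name> `
    -JobName <stream-analytics-job-name> -Name Transformation `
    -StreamingUnit 1 -Query $query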

This Azure Resource Manager template was created by a member of the community and not by Microsoft. Each Resource Manager template is licensed to you under a license agreement by its owner, not Microsoft. Microsoft is not responsible for Resource Manager templates provided and licensed by community members and does not screen for security, compatibility, or performance. Community Resource Manager templates are not supported under any Microsoft support program or service, and are made available AS IS without warranty of any kind.

Parameters

location - The location in which the resources will be created. See supported locations.
eventHubNamespaceName - Name of the Event Hub namespace.
captureTime - The time window, in seconds, for Event Hub capture archival.
captureSize - The size window, in bytes, for Event Hub capture.
eventhubSku - The messaging tier for the Service Bus namespace.
skuCapacity - Messaging units for the Premium namespace.
isAutoInflateEnabled - Enable or disable Auto-Inflate.
messageRetentionInDays - How long, in days, to retain the data in the Event Hub.
partitionCount - Number of partitions for the Event Hub.
captureEncodingFormat - The encoding format in which Event Hub capture serializes the EventData when archiving to your storage.
adlAnalyticsName - The name of the Data Lake Analytics account to create.
adlStoreName - The name of the Data Lake Store account to create.
vmSize - Size of the virtual machine, e.g. Standard_D1_v2.
vm_username - Username for the virtual machine.
vm_password - Password for the virtual machine.
OptionalWizardInstall - Select whether the VM should be in production or not.
dataFactoryName - Name of the data factory. Must be globally unique.
appName - Name of the registered Azure Data Lake UI app. Must be globally unique.
servicePrincipalId - The ID of the service principal that has permissions to create HDInsight clusters in your subscription.
servicePrincipalKey - The access key of the service principal that has permissions to create HDInsight clusters in your subscription.
dataLakeAnalyticsLocation - The location in which the Data Lake Analytics resources will be created. See supported locations.
_artifactsLocation - The base URI where the artifacts required by this template are located.
_artifactsLocationSasToken - The SAS token required to access _artifactsLocation. When the template is deployed using the accompanying scripts, a SAS token is generated automatically. Use the default value if the staging location is not secured.
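When deploying with PowerShell (see "Use the template" below), these parameters can be supplied as a hashtable via -TemplateParameterObject instead of answering interactive prompts. A minimal sketch with placeholder values, showing only a subset of the parameters (the rest fall back to prompts or template defaults):

# Placeholder parameter values for illustration only
$params = @{
    location              = "eastus"
    eventHubNamespaceName = "<namespace-name>"
    captureTime           = 300
    dataFactoryName       = "<globally-unique-name>"
}
New-AzResourceGroupDeployment -ResourceGroupName <resource-group-name> `
    -TemplateUri https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/yash-datalake/azuredeploy.json `
    -TemplateParameterObject $params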

Use the template

PowerShell

New-AzResourceGroup -Name <resource-group-name> -Location <resource-group-location> #use this command when you need to create a new resource group for your deployment
New-AzResourceGroupDeployment -ResourceGroupName <resource-group-name> -TemplateUri https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/yash-datalake/azuredeploy.json
Install and configure Azure PowerShell
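After the deployment has been submitted, you can check its provisioning state (resource group name is a placeholder):

Get-AzResourceGroupDeployment -ResourceGroupName <resource-group-name>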

Command line

az group create --name <resource-group-name> --location <resource-group-location> #use this command when you need to create a new resource group for your deployment
az group deployment create --resource-group <my-resource-group> --template-uri https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/yash-datalake/azuredeploy.json
Install and Configure the Azure Cross-Platform Command-Line Interface
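With the CLI, parameter values can also be passed inline as space-separated key=value pairs; a sketch with placeholder values:

az group deployment create --resource-group <my-resource-group> --template-uri https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/yash-datalake/azuredeploy.json --parameters eventHubNamespaceName=<namespace-name> captureTime=300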