Introducing: Elasticsearch with Azure File storage

By Hans Krijger, Senior Software Engineer, Azure Linux

Posted on February 18, 2016

The Azure Quickstart Templates are a great resource for getting started with template-based deployments for many different technologies, including Elasticsearch. We recently improved the Elasticsearch template so that you can create a pre-configured Elasticsearch cluster that stores its data on Azure File storage, with the option of installing plugins like Sense, Marvel and Kibana, all in just a few minutes.

If you are not familiar with Azure Files, this service offers shared storage using the SMB protocol via mounted shares. This storage is accessible in the same region to any number of virtual machines or roles, and is supported in both Windows and Linux. For more information, take a look at this introduction to Azure File storage.

Since mounting and unmounting shares is quick and can be done while the system is live, Azure Files gives us a way to easily decouple compute and storage, which opens up a lot of exciting possibilities with a technology like Elasticsearch. In particular, shadow replica indices allow us to make use of this shared filesystem with several attractive optimizations:

Only a single copy of the data is indexed; replicas are replaced with shadow replicas
Recovery consists of another node taking ownership of a shard, rather than creating another copy
Rebalancing or redistributing data across more or fewer nodes becomes a lightweight operation

Trying out this new functionality is trivial with the improved Azure Resource Manager template. In this article we will go through a simple example to get you up and running with Elasticsearch on Azure File storage.

Deployment

After selecting Deploy to Azure in the Elasticsearch template, you will need to provide some parameters to the deployment. If unsure, use the default values to get started.

Some of the parameters are explained in more detail below.

Operating system: Azure File support in the template is currently available only on Ubuntu, but support for Windows is coming soon
Resource group: A logical grouping within your subscription for a collection of related resources; typically, you provide a memorable name here
Jumpbox: Provides an entry point into the cluster via SSH (Ubuntu) or RDP (Windows); use this unless you are deploying to an ExpressRoute subscription
Node sizes: The default sizes are good options for getting started, but make sure you deploy enough cores for your workload
Version: It is recommended you use the latest version available (currently 2.2.0) for Azure File support
AFS: Select this option for Azure File storage; note that this option is currently only valid for Ubuntu and Elasticsearch 2.x
Template base: When deploying from the Azure repo, this does not need to be changed; if deploying from your own fork, update this to your repo URI
Kibana and Sense: Selecting both of these provides an easy way to access and interact with your cluster via a public IP
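
For repeat deployments, the same choices can be captured in an Azure Resource Manager parameters file rather than typed into the portal form each time. The Python sketch below only builds that file's structure. The parameter names in `example_values` are hypothetical placeholders, not the template's actual names, which should be taken from the template's own parameter definitions.

```python
import json

# Schema URI used by Azure Resource Manager deployment parameter files.
PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2015-01-01/"
    "deploymentParameters.json#"
)


def build_parameters(values):
    """Wrap plain name/value pairs in the ARM parameters-file structure."""
    return {
        "$schema": PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {name: {"value": value} for name, value in values.items()},
    }


# Hypothetical parameter names, for illustration only;
# check the template for the real names and allowed values.
example_values = {
    "operatingSystem": "ubuntu",
    "esVersion": "2.2.0",
    "azureFileStorage": "Yes",
    "jumpbox": "Yes",
}

# Serialized form, ready to be written to a parameters file.
parameters_json = json.dumps(build_parameters(example_values), indent=2)
```

Writing `parameters_json` to a file gives you something that can be kept in source control next to your fork of the template.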

Accessing your cluster

Once the deployment is complete, you can find the Kibana URL and Jumpbox IP in the deployment outputs for the specified resource group in the Azure portal. Select the resource group, then the Last deployment link under Essentials to find the Kibana and Jumpbox deployment outputs.

Simply use the right-hand buttons to copy the Kibana URL to the clipboard and paste it into a browser window to bring up Kibana. Provided the Sense option was selected at deployment time, you will be able to switch to the Sense app via the Kibana interface.

To execute Sense commands, you need to point Sense at the correct IP. The master nodes use static IPs, so replace localhost with 10.0.0.10, or any other private IP from the cluster; these can be found under the list of resources in the portal.
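
Outside Sense, the same private address can be checked from a machine inside the virtual network, such as the jumpbox. The following is a minimal sketch, assuming Python 3 is available there and that the template leaves Elasticsearch on its default HTTP port, 9200. The helper names `es_url` and `cluster_health` are made up for this example.

```python
import json
import urllib.request

# Elasticsearch's default HTTP port; assumed to be unchanged by the template.
ES_HTTP_PORT = 9200


def es_url(host, path=""):
    """Build an HTTP URL for an Elasticsearch node."""
    return "http://{}:{}/{}".format(host, ES_HTTP_PORT, path.lstrip("/"))


def cluster_health(host, timeout=10):
    """Fetch _cluster/health from the given node and return the parsed JSON."""
    url = es_url(host, "_cluster/health")
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `cluster_health("10.0.0.10")` returns a dictionary whose `status` and `number_of_nodes` fields give a quick indication of whether all nodes have joined the cluster.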

Creating a shadow replica index

When using the AFS option in the template, the elasticsearch.yml settings contain the following:

node.enable_custom_paths: true
node.add_id_to_custom_path: false
path.shared_data: /datadisks/esdata00

This means data on local storage is stored in the default location, and shared filesystem data should be stored under the path /datadisks/esdata00. Therefore, when creating a shadow replica index, you should use index settings like this:

PUT my-shared-index
{
  "index": {
    "data_path": "/datadisks/esdata00",
    "shadow_replicas": true,
    "shared_filesystem.recover_on_any_node": true
  }
}

With these settings, the index will be created in the correct location (under the shared data path), and the index will have shadow replicas. The setting to recover a shard on any node is important, as it instructs Elasticsearch to not wait for a node to rejoin the cluster before recovering its shards from the shared filesystem. You can find more information about shadow replicas in the Elasticsearch documentation.
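
If you create indices from a script rather than from Sense, it can help to check the data path before sending the request. The sketch below is illustrative only: it builds the index settings shown above, rejects a `data_path` that falls outside the shared data path, and prepares the corresponding PUT request without sending it. The helper names are hypothetical, and port 9200 is assumed to be the unchanged Elasticsearch default.

```python
import json
import posixpath
import urllib.request

# Value of path.shared_data in the elasticsearch.yml shown earlier.
SHARED_DATA_PATH = "/datadisks/esdata00"


def is_under(path, root):
    """Return True if path equals root or lies beneath it (POSIX paths)."""
    path = posixpath.normpath(path)
    root = posixpath.normpath(root)
    return path == root or path.startswith(root.rstrip("/") + "/")


def shadow_index_body(data_path, shared_root=SHARED_DATA_PATH):
    """Build shadow replica index settings, refusing paths outside the share."""
    if not is_under(data_path, shared_root):
        raise ValueError("{} is not under {}".format(data_path, shared_root))
    return {
        "index": {
            "data_path": data_path,
            "shadow_replicas": True,
            "shared_filesystem.recover_on_any_node": True,
        }
    }


def create_index_request(host, index_name, body):
    """Prepare (but do not send) the PUT request that creates the index."""
    return urllib.request.Request(
        "http://{}:9200/{}".format(host, index_name),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Passing the prepared request to `urllib.request.urlopen` from a machine inside the virtual network sends it to the cluster. Keeping the path check separate means a mistyped `data_path` fails locally before any request is made.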

Next steps

In this article you saw how to quickly and easily deploy a fully pre-configured Elasticsearch cluster, including optional tools like Marvel, Sense and Kibana. You also learned how to access this cluster and create a shadow replica index on the shared storage provided by Azure Files. To move from development to production deployments, you might be interested in the following topics:

Customizing your deployment by authoring your own Azure Resource Manager templates
Elasticsearch on Azure
Using Azure File storage with Linux

In addition, we are working closely with the Elastic team on providing the tools and guidance for Elasticsearch at scale on Azure File storage. We will be publishing more information about this in the near future, so stay tuned!
