WANdisco enables continuous data replication on Azure HDInsight for Big Data applications

Posted on September 27, 2017

Program Manager, Azure, Big Data

We are pleased to announce that the HDInsight Application Platform now includes WANdisco. You can install the WANdisco Fusion app and take advantage of the free trial.

Azure HDInsight is the industry-leading, fully managed cloud Apache Hadoop and Spark offering. It gives you optimized open source analytic clusters for Spark, Hive, MapReduce, HBase, Storm, Kafka, and Microsoft R Server, backed by a 99.9% SLA.

WANdisco Fusion provides continuous replication of selected data at scale between multiple Big Data and cloud environments. With guaranteed data consistency and continuous availability, Microsoft Azure HDInsight customers will now have easy access to the cost-saving benefits of Fusion’s hybrid architecture for on-demand data analytics and offsite disaster recovery.

This combined offering of WANdisco on Azure HDInsight enables customers to connect their on-premises Big Data applications to HDInsight and expand their analytical footprint faster. Customers can more easily use additional open source workloads and libraries in the cloud, because they can create clusters on demand and run them against the data replicated by WANdisco.

To learn more, please come to our presentation, “Extend on-premises Hadoop and Spark deployments across data centers and the cloud, including Microsoft Azure,” with Pranav Rastogi, Program Manager, Microsoft, and Jagane Sundar, Chief Technology Officer, WANdisco, at Strata Data Conference New York on Thursday, September 28, 2017, at 1:15 PM in room 1A03. For details, please visit the Strata Data Conference website.

The engineering teams are also hosting a webinar where they will discuss this offering in detail. Please join us by registering today.

Microsoft Azure HDInsight – Reliable Open Source Analytics at Enterprise grade and scale

Azure HDInsight is the only fully managed cloud Hadoop offering that provides optimized open source analytical clusters for Spark, Hive, Interactive Hive, MapReduce, HBase, Storm, Kafka, and R Server, backed by a 99.9% SLA. Each of these Big Data technologies is easily deployable as a managed cluster, with enterprise-level security and monitoring.
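To put the 99.9% figure in perspective, the arithmetic below converts an availability percentage into the downtime it permits over a period. This is only a back-of-the-envelope sketch: the function name and the 30-day period are assumptions made for illustration, and the actual SLA terms define how availability is measured.

```python
def allowed_downtime_minutes(sla_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime permitted over the period at the given availability.

    Assumption: a flat 30-day period by default; real SLA documents define
    their own measurement window and exclusions.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


if __name__ == "__main__":
    # 99.9% over an assumed 30-day month is roughly 43 minutes.
    print(f"{allowed_downtime_minutes(99.9):.1f} minutes")
```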

The ecosystem of productivity applications for Big Data has grown immensely, helping customers get more out of their Big Data solutions. However, customers often find these applications hard to discover, and then struggle to install and configure them.

To address this gap, the HDInsight Application Platform provides a unique experience in HDInsight where Independent Software Vendors (ISVs) can offer their applications directly to customers. Customers can now easily discover, install, and use these applications built for the Big Data ecosystem with a single click.

Setting up a hybrid environment for Big Data scenarios has always been a major challenge, because customers have to replicate petabytes of data and keep both environments in sync. To help customers connect their on-premises Big Data environments with HDInsight, WANdisco Fusion can be deployed as an HDInsight application.
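A rough calculation shows why a one-time bulk copy is not enough at this scale. The sketch below estimates how long it takes to move a dataset over a network link. The 10 Gbit/s link speed, the efficiency factor, and the function name are assumptions chosen for illustration, not figures from WANdisco or Microsoft.

```python
PETABYTE = 10**15  # bytes, decimal definition


def transfer_days(data_bytes: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate the days needed to copy data_bytes over a link.

    link_gbps is the nominal link speed in gigabits per second (assumed);
    efficiency is the fraction of that speed actually achieved (assumed).
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be > 0 and efficiency must be in (0, 1]")
    bits = data_bytes * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 86_400  # seconds per day


if __name__ == "__main__":
    # One petabyte over a hypothetical, fully utilized 10 Gbit/s link.
    print(f"{transfer_days(PETABYTE, 10):.1f} days")
```

Under these assumptions, one petabyte takes more than nine days even on a fully utilized link. Data written on-premises during that window would be missing from the copy, which is the gap that continuous replication is meant to close.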

WANdisco Fusion on Azure HDInsight – Move petabyte scale data from on-premises Big Data deployments to Azure

The integration of WANdisco Fusion with Azure HDInsight presents an enterprise solution that enables organizations to meet stringent data availability and compliance requirements whilst seamlessly moving production data at petabyte scale from on-premises big data deployments to Microsoft Azure.

As customers move parts of their Big Data applications to Azure, they gain the flexibility to experiment with advanced analytical offerings, such as R Server on HDInsight, and with more open source machine learning libraries. Experimenting with these on an on-premises Hadoop deployment has traditionally been hard because of IT and hardware procurement. With HDInsight, you can spin up clusters, scale them, and delete them on demand, which makes it easy to experiment in the cloud. Once you have done your analysis, you can determine how much of your Big Data deployment to migrate to the cloud.

Customers can use Fusion for the following scenarios:

  • Hybrid cloud setup for Big Data applications: Connect on-premises Big Data deployments to HDInsight. You can set up replication from any Hadoop or Spark distribution running any open source workload (Hive, Spark, HBase, and more)
  • Multi-cloud: Connect any Big Data deployment running in any cloud to Azure HDInsight
  • Multi-region replication for back-up and disaster recovery


The following are some of the key benefits of Fusion on HDInsight:

  • Continuous data replication: Data is replicated as soon as changes occur, regardless of where those changes are initiated, with guaranteed consistency
  • Opt-in backup: An administrator can select subsets of content for replication, with fine-grained control over where data resides
  • No administrator overhead: Replication is continuous and automatic, and it recovers from intermittent network or system failures on its own, which removes the need for administrative oversight
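The opt-in model above can be pictured as a set of path rules that decide which changes are replicated and where. The sketch below is purely illustrative: `ReplicationRule`, `targets_for`, the paths, and the zone names are hypothetical and are not part of the WANdisco Fusion API.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Iterable, Tuple


@dataclass(frozen=True)
class ReplicationRule:
    """Hypothetical rule: replicate everything under path_prefix to targets."""
    path_prefix: str
    targets: Tuple[str, ...]


def targets_for(path: str, rules: Iterable[ReplicationRule]) -> Tuple[str, ...]:
    """Return the targets of the most specific rule covering path.

    A path covered by no rule is not replicated (empty tuple), which is
    what makes the scheme opt-in.
    """
    changed = PurePosixPath(path)
    best_targets: Tuple[str, ...] = ()
    best_depth = -1
    for rule in rules:
        prefix = PurePosixPath(rule.path_prefix)
        covers = changed == prefix or prefix in changed.parents
        depth = len(prefix.parts)
        if covers and depth > best_depth:
            best_targets, best_depth = rule.targets, depth
    return best_targets


# Example rules (assumed): replicate sales data, but keep scratch files local.
RULES = [
    ReplicationRule("/data/sales", ("onprem", "azure-hdinsight")),
    ReplicationRule("/data/sales/scratch", ()),
]
```

With these example rules, a change to `/data/sales/2017/q3.orc` would go to both zones, while changes under `/data/sales/scratch` or `/data/hr` would stay where they were written.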

Getting started with Fusion on HDInsight

Installing Fusion is a two-step process that sets up the Fusion server and the client libraries required on the cluster.

Install Fusion server: This installs the Fusion server in the same Azure Virtual Network as the HDInsight cluster, which allows the server to access the cluster securely.

Install the Fusion app: Install the app on a new or an existing HDInsight cluster. In the License key field, enter the public IP of the Fusion server.

After you have installed Fusion on HDInsight, you can follow the user guide to set up continuous active replication from on-premises Big Data deployments to Azure HDInsight, multi-region replication, backup and restore, and more.

Strata Presentation and Webinar

To learn more, please come to our presentation, “Extend on-premises Hadoop and Spark deployments across data centers and the cloud, including Microsoft Azure,” with Pranav Rastogi, Program Manager, Microsoft, and Jagane Sundar, Chief Technology Officer, WANdisco, at Strata Data Conference New York on Thursday, September 28, 2017, at 1:15 PM in room 1A03. For details, please visit the Strata Data Conference website.

The engineering teams are also hosting a webinar where they will discuss this offering in detail. Please join us by registering today.

Summary

We are pleased to announce that the HDInsight Application Platform now includes WANdisco. This combined offering of WANdisco on Azure HDInsight enables customers to connect their on-premises Big Data applications to HDInsight in the cloud faster. Please visit us at the Strata session and register for the upcoming webinar to learn more.