Transitioning big data workloads to the cloud: Best practices from Unravel Data

Posted on January 31, 2019

Principal Program Manager, Azure HDInsight

Migrating on-premises Apache Hadoop® and Spark workloads to the cloud remains a key priority for many organizations. In my last post, I shared “Tips and tricks for migrating on-premises Hadoop infrastructure to Azure HDInsight.” In this series, one of HDInsight’s partners, Unravel Data, will share their learnings, best practices, and guidance based on their insights from helping migrate many on-premises Hadoop and Spark deployments to the cloud.

Unravel Data is an AI-driven Application Performance Management (APM) solution for managing and optimizing big data workloads. Unravel Data provides a unified, full-stack view of apps, resources, data, and users, enabling users to baseline and manage app performance and reliability, control costs and SLAs proactively, and apply automation to minimize support overhead. Ops and Dev teams use Unravel Data’s unified capability for on-premises workloads and to plan, migrate, and operate workloads on Azure. Unravel Data is available on the HDInsight Application Platform.

Today’s post, which kicks off the five-part series, comes from Shivnath Babu, CTO and Co-Founder at Unravel Data. This post discusses key considerations in planning for migrations. Upcoming posts will outline the best practices for the migration, operation, and optimization phases of the cloud adoption lifecycle for big data.

Unravel Data’s perspective on migration planning

The cloud is helping to accelerate big data adoption across the enterprise. But while it offers the potential for much greater scalability, flexibility, and optimization, along with lower costs, the operational and visibility challenges that exist on-premises don’t disappear once you’ve migrated workloads away from your data center.

Time and time again, we have experienced situations where migration is oversimplified and considerations such as application dependencies and system version mapping are not given due attention. This results in cost overruns through over-provisioning or production delays through provisioning gaps.

Businesses today are powered by modern data applications that rely on a multitude of platforms. These organizations desperately need a unified way to understand, plan, optimize, and automate the performance of their modern data apps and infrastructure. They need a solution that will allow them to quickly and intelligently resolve performance issues for any system through full-stack observability and AI-driven automation. Only then can these organizations keep up as the business landscape continues to evolve, and be certain that big data investments are delivering on their promises.

Current challenges in big data

Today, IT uses many disparate technologies and siloed approaches to manage the various aspects of their modern data apps and big data infrastructure.

Many existing monitoring solutions do not provide end-to-end support for big data environments, lack full-stack compatibility, or require complex instrumentation. That instrumentation includes configuration changes to applications and their components, which require deep subject matter expertise. The murky soup of monitoring solutions that organizations currently rely on doesn’t deliver the application agility that the business requires.

The result is poor user experience, inefficiencies, and mounting costs as organizations buy more and more tools to solve these problems and then have to spend additional resources managing and maintaining those tools.

Additionally, organizations see high Mean Time to Identify (MTTI) and Mean Time to Resolve (MTTR) for issues because it is hard to understand the dependencies and stay focused on root cause analysis. The lack of granularity and end-to-end visibility makes it impossible to remedy all of these problems, and businesses are stuck in a state of limbo.

It’s not an option to continue doing what was done in the past. Teams need a detailed appreciation of what they are doing today, what gaps they still have, and what steps they can take to improve business outcomes. It’s not uncommon to see 10x or more improvements in root cause analysis and remediation times for customers who are able to gain a deep understanding of the current state of their big data strategy and make a plan for where they need to be.

Starting your big data journey to the cloud

Without a unified APM platform, the challenges only intensify as enterprises move big data to the cloud. Cloud adoption is not a finite process with a clear start and end date. It is an ongoing lifecycle with four broad phases: planning, migration, operation, and optimization. Below, we briefly discuss some of the key challenges and questions that arise for organizations in each phase, which we will dive into in further detail in subsequent posts.

In the planning phase, key questions may include:

  • “Which apps are best suited for a move to the cloud?”
  • “What are the resource requirements?”
  • “How much disk, compute, and memory am I using today?”
  • “What do I need over the next 3, 6, 9, and 12 months?”
  • “Which datasets should I migrate?”
  • “Should I use permanent, transient, autoscaling, or spot instances?”
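
As a rough illustration of the sizing questions above, the sketch below projects today’s disk, compute, and memory footprint over 3, 6, 9, and 12 months. It is not Unravel Data’s method. The Usage type, the project and forecast helpers, the starting numbers, and the assumption of a constant compound monthly growth rate are all hypothetical and would need to be replaced with measurements from your own clusters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Cluster-wide resource footprint (hypothetical fields for illustration)."""
    disk_tb: float
    vcores: float
    memory_gb: float


def project(current: Usage, monthly_growth: float, months: int) -> Usage:
    """Project usage forward, assuming constant compound monthly growth."""
    factor = (1.0 + monthly_growth) ** months
    return Usage(
        disk_tb=current.disk_tb * factor,
        vcores=current.vcores * factor,
        memory_gb=current.memory_gb * factor,
    )


def forecast(current: Usage, monthly_growth: float, horizons=(3, 6, 9, 12)):
    """Return a {months: projected Usage} mapping for each planning horizon."""
    return {months: project(current, monthly_growth, months) for months in horizons}


if __name__ == "__main__":
    # Hypothetical current footprint and growth rate; substitute measured values.
    today = Usage(disk_tb=400.0, vcores=1200.0, memory_gb=9600.0)
    for months, need in forecast(today, monthly_growth=0.04).items():
        print(
            f"{months:>2} months: {need.disk_tb:,.0f} TB disk, "
            f"{need.vcores:,.0f} vcores, {need.memory_gb:,.0f} GB memory"
        )
```

In practice each resource tends to grow at its own rate, and workloads are often seasonal, so a single growth factor is only a starting point for the conversation about instance types and cluster sizes.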

During migration, which can be a long-running process as workloads are iteratively moved, there is a need for continuous monitoring of performance and costs. Key questions may include:

  • “Is the migration successful?”
  • “How does the performance compare to on-premises?”
  • “Have I correctly assessed all the critical dependencies and service mapping?”
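
One simple way to approach the performance question above is to compare per-job runtimes before and after the move. The sketch below assumes you already have runtimes in seconds for each job in both environments. The compare_runtimes helper, the job names, the numbers, and the 10 percent tolerance are hypothetical choices for illustration, not part of any Unravel Data or HDInsight API.

```python
def compare_runtimes(on_prem, cloud, tolerance=0.10):
    """Classify jobs by how their cloud runtime compares to the on-premises baseline.

    on_prem and cloud map job name -> runtime in seconds.
    A job counts as regressed when it runs more than `tolerance` slower in the cloud.
    Jobs with a baseline but no cloud run are reported as missing.
    """
    report = {"regressed": {}, "ok": {}, "missing": []}
    for job, baseline in sorted(on_prem.items()):
        if baseline <= 0:
            raise ValueError(f"baseline runtime for {job!r} must be positive")
        if job not in cloud:
            report["missing"].append(job)
            continue
        ratio = cloud[job] / baseline  # > 1.0 means slower in the cloud
        bucket = "regressed" if ratio > 1.0 + tolerance else "ok"
        report[bucket][job] = round(ratio, 2)
    return report


if __name__ == "__main__":
    # Hypothetical runtimes in seconds; substitute values from your own job history.
    on_prem = {"daily_etl": 3600, "clickstream_agg": 900, "ml_scoring": 1200}
    cloud = {"daily_etl": 3300, "clickstream_agg": 1260}
    print(compare_runtimes(on_prem, cloud))
```

Jobs listed as missing are a useful prompt to revisit dependency and service mapping, since a workload that never ran in the cloud may be blocked on something that was not migrated.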

Once workloads are in production on the cloud, key considerations include:

  • “How do I continue to optimize for cost and for performance to guarantee SLAs?”
  • “How do I ensure Ops teams are as efficient and as automated as possible?”
  • “How do I empower application owners to leverage self-service to solve their own issues easily to improve agility?”

The challenges of managing disparate big data technologies both on-premises and in the cloud can be solved with a comprehensive approach to operational planning. In this blog series, we will dive deeper into each stage of the cloud adoption lifecycle and provide practical advice for every part of the journey. Upcoming posts will outline the best practices for the planning, migration, operation, and optimization phases of this lifecycle.

About HDInsight application platform

The HDInsight application platform provides a one-click deployment experience for discovering and installing popular applications from the big data ecosystem. The applications cater to a variety of scenarios such as data ingestion, data preparation, data management, cataloging, lineage, data processing, analytical solutions, business intelligence, visualization, security, governance, data replication, and many more. The applications are installed on edge nodes, which are created within the same Azure Virtual Network boundary as the other cluster nodes, so you can access these applications securely.

Additional resources