{"id":1737,"date":"2019-01-31T00:00:00","date_gmt":"2019-01-31T00:00:00","guid":{"rendered":"https:\/\/azure.microsoft.com\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data"},"modified":"2025-06-12T05:29:21","modified_gmt":"2025-06-12T12:29:21","slug":"transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data","status":"publish","type":"post","link":"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/","title":{"rendered":"Transitioning big data workloads to the cloud: Best practices from Unravel Data"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Migrating on-premises Apache Hadoop\u00ae and Spark workloads to the cloud remains a key priority for many organizations. In my last post, I shared \u201c<a href=\"https:\/\/azure.microsoft.com\/en-us\/blog\/migrating-on-premises-hadoop-infrastructure-to-azure-hdinsight\/\" target=\"_blank\" rel=\"noopener\">Tips and tricks for migrating on-premises Hadoop infrastructure to Azure HDInsight<\/a>.\u201d In this series, one of HDInsight\u2019s partners,\u202f<a href=\"https:\/\/azuremarketplace.microsoft.com\/en-us\/marketplace\/apps\/unravel-data.unravel-app\" target=\"_blank\" rel=\"noopener\">Unravel Data<\/a>, will share their learnings, best practices, and guidance based on their insights from helping migrate many on-premises Hadoop and Spark deployments to the cloud.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Unravel Data is an AI-driven Application Performance Management (APM) solution for managing and optimizing big data workloads. Unravel Data provides a unified, full-stack view of apps, resources, data, and users, enabling users to baseline and manage app performance and reliability, control costs and SLAs proactively, and apply automation to minimize support overhead. 
Ops and Dev teams use Unravel Data\u2019s unified capability for on-premises workloads and to plan, migrate, and operate workloads on Azure. Unravel Data is available on the\u202fHDInsight Application Platform.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Today\u2019s post, which kicks off the five-part series, comes from Shivnath Babu, CTO and Co-Founder at Unravel Data. This blog series will discuss key considerations in planning for migrations. Upcoming posts will outline the best practices for the migration, operation, and optimization phases of the cloud adoption lifecycle for <a href=\"https:\/\/azure.microsoft.com\/en-us\/resources\/cloud-computing-dictionary\/what-is-big-data-analytics\" target=\"_blank\" rel=\"noopener\">big data<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"unravel-data-s-perspective-on-migration-planning\">Unravel Data\u2019s perspective on migration planning<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The cloud is helping to accelerate big data adoption across the enterprise. But while this provides the potential for much greater scalability, flexibility, optimization, and lower costs for big data, there are certain operational and visibility challenges that exist on-premises that don\u2019t disappear once you\u2019ve migrated workloads away from your data center.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Time and time again, we have experienced situations where migration is oversimplified and considerations such as application dependencies and system version mapping are not given due attention. This results in cost overruns through over-provisioning or production delays through provisioning gaps.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Businesses today are powered by modern data applications that rely on a multitude of platforms. These organizations desperately need a unified way to understand, plan, optimize, and automate the performance of their modern data apps and infrastructure. 
They need a solution that will allow them to quickly and intelligently resolve performance issues for any system through full-stack observability and AI-driven automation. Only then can these organizations keep up as the business landscape continues to evolve, and be certain that big data investments are delivering on their promises.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"current-challenges-in-big-data\">Current challenges in big data<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Today, IT uses many disparate technologies and siloed approaches to manage the various aspects of their modern data apps and big data infrastructure.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Many existing monitoring solutions do not provide end-to-end support for big data environments, lack full-stack compatibility, or require complex instrumentation. Such instrumentation includes configuration changes to applications and their components, which require deep subject matter expertise. The murky soup of monitoring solutions that organizations currently rely on doesn\u2019t deliver the application agility that the business requires.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The result is a poor user experience, inefficiency, and mounting costs as organizations buy more and more tools to solve these problems and then have to spend additional resources managing and maintaining those tools.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Additionally, organizations see high Mean Time to Identify (MTTI) and Mean Time to Resolve (MTTR) because it is hard to understand the dependencies and stay focused on root cause analysis. The lack of granularity and end-to-end visibility makes it impossible to remedy all of these problems, and businesses are stuck in a state of limbo.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">It\u2019s not an option to continue doing what was done in the past. 
Teams need a detailed appreciation of what they are doing today, what gaps they still have, and what steps they can take to improve business outcomes. It\u2019s not uncommon to see improvements of 10x or more in root cause analysis and remediation times for customers who are able to gain a deep understanding of the current state of their big data strategy and make a plan for where they need to be.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"starting-your-big-data-journey-to-the-cloud\">Starting your big data journey to the cloud<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Without a unified APM platform, the challenges only intensify as enterprises move big data to the cloud. Cloud adoption is not a finite process with a clear start and end date; it\u2019s an ongoing lifecycle with four broad phases (planning, migration, operation, and optimization). Below, we briefly discuss some of the key challenges and questions that arise for organizations, which we will explore in further detail in subsequent posts.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In the\u202f<strong>planning\u202f<\/strong>phase, key questions may include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">\u201cWhich apps are best suited for a move to the cloud?\u201d<\/li>\n\n\n\n<li class=\"wp-block-list-item\">\u201cWhat are the resource requirements?\u201d<\/li>\n\n\n\n<li class=\"wp-block-list-item\">\u201cHow much disk, compute, and memory am I using today?\u201d<\/li>\n\n\n\n<li class=\"wp-block-list-item\">\u201cWhat do I need over the next 3, 6, 9, and 12 months?\u201d<\/li>\n\n\n\n<li class=\"wp-block-list-item\">\u201cWhich datasets should I migrate?\u201d<\/li>\n\n\n\n<li class=\"wp-block-list-item\">\u201cShould I use permanent, transient, autoscaling, or spot instances?\u201d<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">During\u202f<strong>migration<\/strong>,\u202fwhich can be a long-running process as workloads are iteratively moved, there 
is a need for continuous monitoring of performance and costs. Key questions may include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">\u201cIs the migration successful?\u201d<\/li>\n\n\n\n<li class=\"wp-block-list-item\">\u201cHow does the performance compare to on-premises?\u201d<\/li>\n\n\n\n<li class=\"wp-block-list-item\">\u201cHave I correctly assessed all the critical dependencies and service mapping?\u201d<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Once workloads are<strong>\u202fin production\u202f<\/strong>on the cloud, key considerations include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">\u201cHow do I continue to optimize for cost and for performance to guarantee SLAs?\u201d<\/li>\n\n\n\n<li class=\"wp-block-list-item\">\u201cHow do I ensure Ops teams are as efficient and as automated as possible?\u201d<\/li>\n\n\n\n<li class=\"wp-block-list-item\">\u201cHow do I empower application owners to solve their own issues easily through self-service and improve agility?\u201d<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">The challenges of managing disparate big data technologies both on-premises and in the cloud can be solved with a comprehensive approach to operational planning. In this blog series, we will dive deeper into each stage of the cloud adoption lifecycle and provide practical advice for every part of the journey. Upcoming posts will outline the best practices for the planning, migration, operation, and optimization phases of this lifecycle.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"about-hdinsight-application-platform\">About HDInsight application platform<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The\u202fHDInsight application platform\u202fprovides a one-click deployment experience for discovering and installing popular applications from the big data ecosystem. 
The applications cater to a variety of scenarios such as data ingestion, data preparation, data management, cataloging, lineage, data processing, analytical solutions, business intelligence, visualization, security, governance, data replication, and many more. The applications are installed on edge nodes which are created within the same Azure Virtual Network boundary as the other cluster nodes so you can access these applications in a secure manner.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"additional-resources\">Additional resources<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Learn more about\u202f<a href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/hdinsight\/\" target=\"_blank\" rel=\"noopener\">Azure HDInsight<\/a><\/li>\n\n\n\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/hdinsight\/hadoop\/apache-hadoop-on-premises-migration-motivation?toc=%2Fen-us%2Fazure%2Fhdinsight%2Fhadoop%2FTOC.json&amp;bc=%2Fen-us%2Fazure%2Fbread%2Ftoc.json\" target=\"_blank\" rel=\"noopener\">Migrate on-premises Apache Hadoop clusters to Azure HDInsight<\/a><\/li>\n\n\n\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/azure.microsoft.com\/en-us\/blog\/get-up-to-speed-with-azure-hdinsight-the-comprehensive-guide\/\" target=\"_blank\" rel=\"noopener\">Get up to speed with Azure HDInsight: The comprehensive guide<\/a><\/li>\n\n\n\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/hdinsight\/hdinsight-component-versioning\" target=\"_blank\" rel=\"noopener\">Open Source component guide on HDInsight<\/a><\/li>\n\n\n\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/hdinsight\/hdinsight-release-notes\" target=\"_blank\" rel=\"noopener\">HDInsight release notes<\/a><\/li>\n\n\n\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/hdinsight\/hdinsight-release-notes\" target=\"_blank\" 
rel=\"noopener\">Ask HDInsight questions on\u202fMSDN forums<\/a><\/li>\n\n\n\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/social.msdn.microsoft.com\/forums\/azure\/en-us\/home?forum=hdinsight\" target=\"_blank\" rel=\"noopener\">Ask HDInsight questions on\u202fStackOverflow<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Migrating on-premises Apache Hadoop\u00ae and Spark workloads to the cloud remains a key priority for many organizations. In my last post, I shared \u201cTips and tricks for migrating on-premises Hadoop infrastructure to Azure HDInsight.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ms_queue_id":[],"ep_exclude_from_search":false,"_classifai_error":"","_classifai_text_to_speech_error":"","_alt_title":"","footnotes":"","msx_community_cta_settings":[]},"categories":[1474],"tags":[48],"audience":[3054,3057,3053],"content-type":[1511],"product":[2895],"tech-community":[],"topic":[],"coauthors":[97],"class_list":["post-1737","post","type-post","status-publish","format-standard","hentry","category-analytics","tag-big-data","audience-business-decision-makers","audience-data-professionals","audience-it-decision-makers","content-type-best-practices","product-azure-hdinsight-on-azure-kubernetes-service-aks","review-flag-3-1680286581-173","review-flag-6-1680286581-909","review-flag-9-1680286581-259","review-flag-ai-driven-ai-driven-2","review-flag-lever-1680286579-649","review-flag-on-pr-1680286585-498"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.2 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Transitioning big data workloads to the cloud: Best practices from Unravel Data | Microsoft Azure Blog<\/title>\n<meta name=\"description\" content=\"Migrating on-premises Apache Hadoop\u00ae and Spark workloads to the cloud remains a key priority for many organizations. 
In my last post, I shared \u201cTips and tricks for migrating on-premises Hadoop infrastructure to Azure HDInsight.\u201d\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Transitioning big data workloads to the cloud: Best practices from Unravel Data | Microsoft Azure Blog\" \/>\n<meta property=\"og:description\" content=\"Migrating on-premises Apache Hadoop\u00ae and Spark workloads to the cloud remains a key priority for many organizations. In my last post, I shared \u201cTips and tricks for migrating on-premises Hadoop infrastructure to Azure HDInsight.\u201d\" \/>\n<meta property=\"og:url\" content=\"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/\" \/>\n<meta property=\"og:site_name\" content=\"Microsoft Azure Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/microsoftazure\" \/>\n<meta property=\"article:published_time\" content=\"2019-01-31T00:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-06-12T12:29:21+00:00\" \/>\n<meta name=\"author\" content=\"Microsoft Azure\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@azure\" \/>\n<meta name=\"twitter:site\" content=\"@azure\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Microsoft Azure\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/\"},\"author\":[{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/microsoft-azure\/\",\"@type\":\"Person\",\"@name\":\"Microsoft Azure\"}],\"headline\":\"Transitioning big data workloads to the cloud: Best practices from Unravel Data\",\"datePublished\":\"2019-01-31T00:00:00+00:00\",\"dateModified\":\"2025-06-12T12:29:21+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/\"},\"wordCount\":1115,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization\"},\"keywords\":[\"Big Data\"],\"articleSection\":[\"Analytics\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/\",\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/\",\"name\":\"Transitioning big data workloads to the cloud: Best practices from Unravel Data | Microsoft Azure 
Blog\",\"isPartOf\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#website\"},\"datePublished\":\"2019-01-31T00:00:00+00:00\",\"dateModified\":\"2025-06-12T12:29:21+00:00\",\"description\":\"Migrating on-premises Apache Hadoop\u00ae and Spark workloads to the cloud remains a key priority for many organizations. In my last post, I shared \u201cTips and tricks for migrating on-premises Hadoop infrastructure to Azure HDInsight.\u201d\",\"breadcrumb\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Blog home\",\"item\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Analytics\",\"item\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/category\/analytics\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Transitioning big data workloads to the cloud: Best practices from Unravel Data\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#website\",\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/\",\"name\":\"Microsoft Azure Blog\",\"description\":\"Get the latest Azure news, updates, and announcements from the Azure blog. 
From product updates to hot topics, hear from the Azure experts.\",\"publisher\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization\",\"name\":\"Microsoft Azure Blog\",\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2024\/06\/microsoft_logo.webp\",\"contentUrl\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2024\/06\/microsoft_logo.webp\",\"width\":512,\"height\":512,\"caption\":\"Microsoft Azure 
Blog\"},\"image\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/microsoftazure\",\"https:\/\/x.com\/azure\",\"https:\/\/www.instagram.com\/microsoftdeveloper\/\",\"https:\/\/www.linkedin.com\/company\/16188386\",\"https:\/\/www.youtube.com\/user\/windowsazure\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/person\/c702e5edd662b328b49b7e1180cab117\",\"name\":\"shakir\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/secure.gravatar.com\/avatar\/9342c7c05bb16548741bc5cd3a3e3b7ee0c8e746844ad2cc582db5beb5514c6f?s=96&d=mm&r=g7664e653ea371ce16eaf75e9fa8952c4\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/9342c7c05bb16548741bc5cd3a3e3b7ee0c8e746844ad2cc582db5beb5514c6f?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/9342c7c05bb16548741bc5cd3a3e3b7ee0c8e746844ad2cc582db5beb5514c6f?s=96&d=mm&r=g\",\"caption\":\"shakir\"},\"sameAs\":[\"https:\/\/azure.microsoft.com\"],\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/shakir\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Transitioning big data workloads to the cloud: Best practices from Unravel Data | Microsoft Azure Blog","description":"Migrating on-premises Apache Hadoop\u00ae and Spark workloads to the cloud remains a key priority for many organizations. 
In my last post, I shared \u201cTips and tricks for migrating on-premises Hadoop infrastructure to Azure HDInsight.\u201d","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/","og_locale":"en_US","og_type":"article","og_title":"Transitioning big data workloads to the cloud: Best practices from Unravel Data | Microsoft Azure Blog","og_description":"Migrating on-premises Apache Hadoop\u00ae and Spark workloads to the cloud remains a key priority for many organizations. In my last post, I shared \u201cTips and tricks for migrating on-premises Hadoop infrastructure to Azure HDInsight.\u201d","og_url":"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/","og_site_name":"Microsoft Azure Blog","article_publisher":"https:\/\/www.facebook.com\/microsoftazure","article_published_time":"2019-01-31T00:00:00+00:00","article_modified_time":"2025-06-12T12:29:21+00:00","author":"Microsoft Azure","twitter_card":"summary_large_image","twitter_creator":"@azure","twitter_site":"@azure","twitter_misc":{"Written by":"Microsoft Azure","Est. 
reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/#article","isPartOf":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/"},"author":[{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/microsoft-azure\/","@type":"Person","@name":"Microsoft Azure"}],"headline":"Transitioning big data workloads to the cloud: Best practices from Unravel Data","datePublished":"2019-01-31T00:00:00+00:00","dateModified":"2025-06-12T12:29:21+00:00","mainEntityOfPage":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/"},"wordCount":1115,"commentCount":0,"publisher":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization"},"keywords":["Big Data"],"articleSection":["Analytics"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/","name":"Transitioning big data workloads to the cloud: Best practices from Unravel Data | Microsoft Azure Blog","isPartOf":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#website"},"datePublished":"2019-01-31T00:00:00+00:00","dateModified":"2025-06-12T12:29:21+00:00","description":"Migrating on-premises Apache Hadoop\u00ae and Spark workloads to the cloud remains a key priority for many organizations. 
In my last post, I shared \u201cTips and tricks for migrating on-premises Hadoop infrastructure to Azure HDInsight.\u201d","breadcrumb":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/transitioning-big-data-workloads-to-the-cloud-best-practices-from-unravel-data\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog home","item":"https:\/\/azure.microsoft.com\/en-us\/blog\/"},{"@type":"ListItem","position":2,"name":"Analytics","item":"https:\/\/azure.microsoft.com\/en-us\/blog\/category\/analytics\/"},{"@type":"ListItem","position":3,"name":"Transitioning big data workloads to the cloud: Best practices from Unravel Data"}]},{"@type":"WebSite","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#website","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/","name":"Microsoft Azure Blog","description":"Get the latest Azure news, updates, and announcements from the Azure blog. 
From product updates to hot topics, hear from the Azure experts.","publisher":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/azure.microsoft.com\/en-us\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization","name":"Microsoft Azure Blog","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2024\/06\/microsoft_logo.webp","contentUrl":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2024\/06\/microsoft_logo.webp","width":512,"height":512,"caption":"Microsoft Azure 
Blog"},"image":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/microsoftazure","https:\/\/x.com\/azure","https:\/\/www.instagram.com\/microsoftdeveloper\/","https:\/\/www.linkedin.com\/company\/16188386","https:\/\/www.youtube.com\/user\/windowsazure"]},{"@type":"Person","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/person\/c702e5edd662b328b49b7e1180cab117","name":"shakir","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/9342c7c05bb16548741bc5cd3a3e3b7ee0c8e746844ad2cc582db5beb5514c6f?s=96&d=mm&r=g7664e653ea371ce16eaf75e9fa8952c4","url":"https:\/\/secure.gravatar.com\/avatar\/9342c7c05bb16548741bc5cd3a3e3b7ee0c8e746844ad2cc582db5beb5514c6f?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/9342c7c05bb16548741bc5cd3a3e3b7ee0c8e746844ad2cc582db5beb5514c6f?s=96&d=mm&r=g","caption":"shakir"},"sameAs":["https:\/\/azure.microsoft.com"],"url":"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/shakir\/"}]}},"msxcm_display_generated_audio":false,"msxcm_animated_featured_image":null,"distributor_meta":false,"distributor_terms":false,"distributor_media":false,"distributor_original_site_name":"Microsoft Azure 
Blog","distributor_original_site_url":"https:\/\/azure.microsoft.com\/en-us\/blog","push-errors":false,"_links":{"self":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts\/1737","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/comments?post=1737"}],"version-history":[{"count":1,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts\/1737\/revisions"}],"predecessor-version":[{"id":41606,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts\/1737\/revisions\/41606"}],"wp:attachment":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/media?parent=1737"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/categories?post=1737"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/tags?post=1737"},{"taxonomy":"audience","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/audience?post=1737"},{"taxonomy":"content-type","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/content-type?post=1737"},{"taxonomy":"product","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/product?post=1737"},{"taxonomy":"tech-community","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/tech-community?post=1737"},{"taxonomy":"topic","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/topic?post=1737"},{"taxonomy":"author","embeddable":tr
ue,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/coauthors?post=1737"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}