Powerful Debugging Tools for Spark for Azure HDInsight

By Jenny Jiang, Principal Program Manager, Big Data Team

Posted on September 6, 2018

Microsoft runs one of the largest big data clusters in the world, internally called Cosmos. It runs millions of jobs across hundreds of thousands of servers over multiple exabytes of data. Running and managing jobs at this scale was a huge challenge for developers. Jobs with hundreds of thousands of vertices are common, and quickly figuring out why a job runs slowly or where its bottlenecks are was difficult. We built powerful tools that graphically show the entire job graph, including vertex execution times, playback, and more, which helped developers greatly. While these tools were built for Scope, our internal language in Cosmos, we are working very hard to bring this power to all Spark developers.

Today, we are delighted to announce the public preview of the Apache Spark Debugging Toolset for HDInsight, available on Spark 2.3 clusters and later. The default Spark History Server user experience in HDInsight is now enhanced with rich information about your Spark jobs and powerful interactive visualizations of job graphs and data flows. The new features greatly assist HDInsight Spark developers in job data management, data sampling, job monitoring, and job diagnosis.

Spark History Server enhancements

The Spark History Server Experience in HDInsight now features two new tabs: Graph and Data.

Graph tab: The job graph is a powerful interactive visualization of your jobs. It enables debugging experiences such as playback and heatmap, by progress, data read, and data written, for a whole Spark application and for individual jobs.

The Spark job graph displays Spark job execution details, with data input and output across stages. For completed jobs, it lets Spark developers play back the job by progress, data read, and data written, with details. You can now dig into Spark job diagnosis around performance, data, and execution time using this experience, which highlights stage outliers.

Data tab: The Data tab shows job-specific input and output data, with support for search, download, preview, data copy, data URL copy, and export to CSV, along with a view of table operations.

As a developer or data scientist, you can preview, download, copy, and export the data to a CSV file. You can also download part of the data as a sample for local runs and local debugging. Interpreting and correlating metadata has always been a challenge in debugging. The new Table Operations feature lets you view the Hive metadata and investigate table operations at each stage to gain more insight for troubleshooting and Spark job analysis.
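If a CSV export from the Data tab is still too large for a quick local run, you can trim it further on your own machine. The snippet below is a minimal sketch and not part of the toolset: it assumes the export is an ordinary CSV file with a header row (the post does not describe the export layout), and the function name `sample_csv` and the sample rows are made up for illustration.

```python
import csv
import io
import random


def sample_csv(source, k, seed=0):
    """Reservoir-sample up to k data rows from a CSV stream, keeping the header.

    Assumption: the first row is a header. Returns (header, rows).
    """
    reader = csv.reader(source)
    header = next(reader, None)
    if header is None:
        # Empty input: nothing to sample.
        return [], []

    rng = random.Random(seed)  # fixed seed so local debug runs are repeatable
    reservoir = []
    for i, row in enumerate(reader):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = row
    return header, reservoir


# Illustrative data standing in for an exported file.
exported = io.StringIO("id,value\n1,a\n2,b\n3,c\n")
header, rows = sample_csv(exported, k=2)
```

Reservoir sampling reads the file once and keeps only `k` rows in memory, so it also works for exports that are too large to load whole.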

Developer nirvana

HDInsight Spark developers can greatly increase their productivity by leveraging these capabilities:

Preview and download Spark job input and output data, as well as view Spark job table operations.
View and play back the Spark application/job graph by progress, data read, and data written.
Identify the Spark application execution dependencies among stages and jobs for performance tuning.
View the data read/write heatmap and identify outliers and the bottleneck stage and job for Spark job performance diagnosis.
View Spark job/stage data input/output size and time duration for performance optimization.
Locate a failed stage and drill down into failed task details for debugging.
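To make the idea of a stage outlier concrete, here is a minimal sketch of the kind of check the heatmap lets you do visually. It is not how the toolset works internally. It assumes you already have per-stage metrics as plain dictionaries whose field names (`stageId`, `status`, `executorRunTime`, `inputBytes`) mirror the stage data returned by the open-source Spark monitoring REST API. The sample values, the median-based rule, and the threshold `factor` are all illustrative assumptions.

```python
from statistics import median


def flag_outlier_stages(stages, metric="executorRunTime", factor=3.0):
    """Return IDs of stages whose metric exceeds `factor` times the median.

    A simple heuristic chosen for illustration, not the toolset's own rule.
    """
    values = [s[metric] for s in stages]
    if not values:
        return []
    typical = median(values)
    return [s["stageId"] for s in stages if s[metric] > factor * typical]


def failed_stages(stages):
    """Return IDs of stages reported as failed."""
    return [s["stageId"] for s in stages if s.get("status") == "FAILED"]


# Made-up metrics for four stages of one application.
stages = [
    {"stageId": 0, "status": "COMPLETE", "executorRunTime": 1200, "inputBytes": 50_000_000},
    {"stageId": 1, "status": "COMPLETE", "executorRunTime": 1500, "inputBytes": 48_000_000},
    {"stageId": 2, "status": "COMPLETE", "executorRunTime": 9800, "inputBytes": 400_000_000},
    {"stageId": 3, "status": "FAILED", "executorRunTime": 900, "inputBytes": 10_000_000},
]

slow = flag_outlier_stages(stages)                        # outliers by run time
heavy = flag_outlier_stages(stages, metric="inputBytes")  # outliers by data read
failed = failed_stages(stages)
```

Comparing against the median rather than the mean keeps one very large stage from hiding itself by inflating the baseline.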

Getting started with Apache Spark Debugging Toolset

These features are built into the HDInsight Spark History Server.

Access from the Azure portal: open the Spark cluster, click Cluster Dashboard from Quick Links, and then click Spark History Server.

Access by URL: open the Spark History Server.

More features to come

Critical path analysis for Spark application and job
Spark job diagnosis
- Data Skew and Time Skew Analysis
- Executor Usage Analysis
Debugging on failed job

Feedback

We look forward to your comments and feedback. If you have a feature request, customer ask, or suggestion, please send a note to hdivstool@microsoft.com. To report a bug, please open a new ticket using the template.

For more information, check out the following:

Learn more about today’s announcements on the Azure blog and Big Data blog, and discover more Azure service updates.
