To help you leverage your telemetry data and better monitor the behavior of your Azure applications, we are happy to provide a Jupyter Notebook template that extends the power of Application Insights. Instead of making ad hoc queries in the Application Insights portal when an issue arises, you can now write a Jupyter Notebook that routinely queries for telemetry data, performs advanced analytics, and sends the derived data back to Application Insights for monitoring and alerting. You can execute the Jupyter Notebook using an Azure WebJob, either on a schedule or via a webhook.
Through this approach, you can manipulate and analyze your telemetry data beyond the constraints of the query language and its limits. You can take advantage of the existing alerting system to monitor the newly derived data, rather than raw instrumentation data. The derived data can also be correlated with other metrics for root cause analysis, used to train machine learning models, and much more. In this blog post, you will find a step-by-step guide for operationalizing this template to perform advanced analytics on your telemetry data, as well as an example implementation.
Create a Jupyter Notebook
Create a new Notebook or clone the template. While Jupyter supports various programming languages, this blog post focuses on performing advanced analytics in Python 2.7.
Query for telemetry data from Application Insights
To query telemetry data from an Application Insights resource, you need the Application ID and an API Key. Both can be found in the Application Insights portal, on the API Access blade under Configure.
!pip install --upgrade applicationinsights-jupyter
from applicationinsights_jupyter import Jupyter
API_URL = "https://api.aimon.applicationinsights.io/"
APP_ID = "REDACTED"
API_KEY = "REDACTED"
QUERY_STRING = "customEvents\
| where timestamp >= ago(10m) and timestamp < ago(5m)\
| where name == 'NodeProcessStarted'\
| summarize pids=makeset(tostring(customDimensions.PID)) by cloud_RoleName, cloud_RoleInstance, bin(timestamp, 1m)"
jupyterObj = Jupyter(APP_ID, API_KEY, API_URL)
jupyterObjData = jupyterObj.getAIData(QUERY_STRING)
For more information, see the Application Insights REST API documentation.
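This post does not describe the shape of the value returned by getAIData. As a minimal sketch, assuming it is (or can be parsed into) the JSON body that the Application Insights REST query API returns, with a tables list whose entries carry columns and rows, a hypothetical helper such as rows_as_dicts can turn the first table into a list of dictionaries that are easier to analyze. The sample data below is hand-written for illustration, and it assumes the pids column has already been decoded into a Python list.

```python
def rows_as_dicts(response):
    """Convert the first result table into a list of {column: value} dicts.

    Assumes `response` is a dict shaped like the REST query API body:
    {"tables": [{"columns": [{"name": ...}, ...], "rows": [[...], ...]}]}
    """
    table = response["tables"][0]
    names = [column["name"] for column in table["columns"]]
    return [dict(zip(names, row)) for row in table["rows"]]


# Hand-written sample in the assumed shape (not real telemetry).
sample_response = {
    "tables": [{
        "name": "PrimaryResult",
        "columns": [
            {"name": "cloud_RoleName", "type": "string"},
            {"name": "cloud_RoleInstance", "type": "string"},
            {"name": "timestamp", "type": "datetime"},
            {"name": "pids", "type": "dynamic"},
        ],
        "rows": [
            ["web", "instance_0", "2018-01-01T00:00:00Z",
             ["101", "102", "103", "104"]],
            ["web", "instance_1", "2018-01-01T00:03:00Z", ["207"]],
        ],
    }]
}

records = rows_as_dicts(sample_response)
```

Each entry in records can then be addressed by column name, for example records[1]["pids"].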
Send derived data back to Application Insights
To send data to an Application Insights resource, you need the Instrumentation Key. It can be found in the Application Insights portal, on the Overview blade.
!pip install applicationinsights
from applicationinsights import TelemetryClient
IKEY = "REDACTED"
tc = TelemetryClient(IKEY)
tc.track_metric("crashCount", 1)
tc.flush()
For more information, see the documentation for the Application Insights SDK for Python.
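When the analysis produces one value per cloud instance, it can help to send each value as its own metric with the instance name attached as a property. The following is a sketch under assumptions: send_crash_counts and RecordingClient are hypothetical names introduced here, and the client argument is assumed to behave like TelemetryClient, exposing track_metric with a properties keyword and flush. The stand-in client only records calls so the example runs without network access.

```python
def send_crash_counts(client, crash_counts):
    """Send one crashCount metric per instance, then flush.

    `client` is assumed to behave like applicationinsights.TelemetryClient,
    i.e. to expose track_metric(name, value, properties=...) and flush().
    """
    for instance, count in sorted(crash_counts.items()):
        client.track_metric("crashCount", count,
                            properties={"cloud_RoleInstance": instance})
    client.flush()
    return len(crash_counts)


class RecordingClient(object):
    """Stand-in for TelemetryClient that records calls instead of sending."""

    def __init__(self):
        self.sent = []
        self.flushed = False

    def track_metric(self, name, value, properties=None):
        self.sent.append((name, value, properties))

    def flush(self):
        self.flushed = True


demo_client = RecordingClient()
sent_count = send_crash_counts(demo_client, {"instance_1": 2, "instance_0": 1})
```

In the Notebook itself, you would pass TelemetryClient(IKEY) in place of the stand-in.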
Execute the Notebook using Azure WebJob
To execute the Notebook using an Azure WebJob, upload the Notebook, its dependencies, and the Jupyter server to an Azure App Service container.
Prepare the necessary resources
- Download the Notebook onto your machine.
- Install the Jupyter server using Anaconda.
- Execute the Notebook on your machine to install all dependencies, because the App Service container does not allow changes to the directories where the modules would otherwise be installed automatically.
- Update the path in a dependency to reflect App Service container’s directory. Replace the first script in Anaconda2/Scripts/jupyter-nbconvert-script.py with
- Remove the pip commands from the local copy of the Notebook.
- Create a run.cmd file containing the following script:
D:\home\site\wwwroot\App_Data\resources\Anaconda2\Scripts\jupyter nbconvert --execute <Your Notebook Name>.ipynb
- Obtain deployment credentials and FTP connection information.
- Use FTP to upload the Anaconda2 folder to a new directory in the App Service container. The run.cmd script expects it under D:\home\site\wwwroot\App_Data\resources.
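The step that removes pip commands from the Notebook can be scripted rather than done by hand. The sketch below assumes the Notebook file uses the nbformat 4 JSON layout (a top-level cells list whose entries have cell_type and source). The helper name strip_pip_commands and the sample notebook are hypothetical and only illustrate the idea.

```python
import json


def strip_pip_commands(notebook):
    """Return a copy of a notebook dict with `!pip ...` lines removed.

    Assumes the nbformat 4 layout: a top-level "cells" list whose entries
    have "cell_type" and "source" (a list of lines or a single string).
    Only code cells are changed.
    """
    cleaned = json.loads(json.dumps(notebook))  # simple deep copy
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        if not isinstance(source, list):
            source = source.splitlines(True)  # keep line endings
        cell["source"] = [line for line in source
                          if not line.lstrip().startswith("!pip")]
    return cleaned


# Hand-written sample notebook (not the actual template).
sample_notebook = {
    "nbformat": 4,
    "cells": [
        {"cell_type": "code",
         "source": ["!pip install applicationinsights\n",
                    "from applicationinsights import TelemetryClient\n",
                    "tc = TelemetryClient(IKEY)"]},
        {"cell_type": "markdown",
         "source": ["!pip is mentioned here but this is not a code cell"]},
    ],
}

cleaned_notebook = strip_pip_commands(sample_notebook)
```

For a real file, you would load the .ipynb with json.load, pass the result through the helper, and write it back with json.dump before uploading.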
Operationalize the Notebook
- Create a new Azure WebJob and upload the Notebook and run.cmd file.
An example implementation
We operationalized this template and have been performing advanced analytics on telemetry data of one of our own services.
Our service runs four Node.js processes on each cloud instance. From root cause analysis, we have noticed cases of Node.js crashes. However, due to limitations of the SDK, we cannot log when the crash occurs. So, we created a Jupyter Notebook to analyze the existing telemetry data to detect Node.js crashes.
A custom event, NodeProcessStarted, is logged when a new Node.js process starts in a cloud instance. Normally, all four processes start nearly simultaneously when they are recycled every 8-11 hours. So, when fewer than four NodeProcessStarted events occur together, outside the regular recycle cycle, we can infer that new processes started to replace recently crashed ones.
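The core of that inference can be sketched in a few lines. This is an illustration under assumptions, not the implementation in the template: it assumes the query results are available as dictionaries with cloud_RoleInstance, timestamp, and a pids list for each one-minute bin, and it checks only the count of started processes, leaving out the timing check against the regular recycle cycle. The function name and sample records are hypothetical.

```python
EXPECTED_PROCESSES = 4  # each cloud instance runs four Node.js processes


def find_suspected_restarts(records, expected=EXPECTED_PROCESSES):
    """Return the bins in which fewer than `expected` processes started.

    A full recycle starts all processes together, so a bin with fewer
    starts suggests replacements for crashed processes.
    """
    suspects = []
    for record in records:
        started = len(record["pids"])
        if 0 < started < expected:
            suspects.append({
                "cloud_RoleInstance": record["cloud_RoleInstance"],
                "timestamp": record["timestamp"],
                "restarted": started,
            })
    return suspects


# Hand-written sample records (not real telemetry).
sample_records = [
    {"cloud_RoleInstance": "instance_0",
     "timestamp": "2018-01-01T00:00:00Z",
     "pids": ["101", "102", "103", "104"]},   # normal recycle
    {"cloud_RoleInstance": "instance_1",
     "timestamp": "2018-01-01T00:03:00Z",
     "pids": ["207"]},                        # likely crash replacement
]

suspected = find_suspected_restarts(sample_records)
```

The restarted value of each suspect is a natural candidate for a derived metric such as crashCount.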
In this implemented template, you will see how we query for telemetry data, analyze the data, query for more telemetry data to enrich the analysis, and then send the derived data back to Application Insights.
We hope this template helps you derive actionable insights from telemetry data and better manage your Azure applications.