{"id":1691,"date":"2019-02-11T00:00:00","date_gmt":"2019-02-11T00:00:00","guid":{"rendered":"https:\/\/azure.microsoft.com\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation"},"modified":"2025-06-17T00:29:48","modified_gmt":"2025-06-17T07:29:48","slug":"controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation","status":"publish","type":"post","link":"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/","title":{"rendered":"Controlling costs in Azure Data Explorer using down-sampling and aggregation"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Azure Data Explorer (ADX) is an outstanding service for continuous ingestion and storage of high velocity telemetry data from cloud services and IoT devices. Its first-rate performance for querying billions of records lets you further analyze the telemetry data for insights such as monitoring service health, production processes, and usage trends. Depending on data velocity and retention policy, data size can rapidly scale to petabytes and increase the costs associated with data storage. A common solution for storing large datasets over a long period of time is to store the data at differing resolutions. The most recent data is stored at maximum resolution, meaning all events are stored in raw format, while the historic data is stored at reduced resolution, filtered and\/or aggregated. This solution is often used for time series databases to control hot storage costs.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In this blog, I\u2019ll use the GitHub events public dataset as the playground. 
To learn how to stream GitHub events into your own ADX cluster, read the blog \u201c<a href=\"https:\/\/medium.com\/microsoftazure\/exploring-github-events-with-azure-data-explorer-69f28eb705b9\" target=\"_blank\" rel=\"noreferrer noopener\">Exploring GitHub events with Azure Data Explorer<\/a>.\u201d I\u2019ll describe how ADX users can take advantage of stored functions, the \u201c.set-or-append\u201d command, and the Microsoft Flow Azure Kusto connector to create and update tables with filtered, down-sampled, and aggregated data for controlling storage costs. The following are the steps I performed.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"create-a-function-for-down-sampling-and-aggregation\">Create a function for down-sampling and aggregation<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The ADX demo11 cluster contains a database named GitHub. Since 2016, all events from <a href=\"https:\/\/www.gharchive.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">GHArchive<\/a> have been ingested into the <strong>GitHubEvent<\/strong> table and now total more than 1 billion records. 
Each GitHub event is represented in a single record with event-related information on the repository, author, comments, and more.<\/p>\n\n\n\n<figure class=\"wp-block-image has-custom-border\"><img decoding=\"async\" src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2019\/02\/49ea1297-5fe8-4648-bf2d-d913f8ea507e.webp\" alt=\"Screenshot of Azure Data Explorer demo11 and GitHub database\" style=\"border-radius:0px\" title=\"Screenshot of Azure Data Explorer demo11 and GitHub database\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">First, I created the stored function <strong>AggregateReposWeeklyActivity<\/strong>, which counts the total number of events in every repository for a given week.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code #highlighter_6120 {   overflow: visible; \/* or remove overflow entirely *\/ }\"><pre class=\"brush: plain; auto-links: false; gutter: false; title: ; quick-code: false; notranslate\" title=\"\">\n.create-or-alter function with (folder = \"TimeSeries\", docstring = \"Aggregate Weekly Repos Activity\")\nAggregateReposWeeklyActivity(StartTime:datetime)\n{\n     \/\/ Align the requested time to the start of its week\n     let PeriodStart = startofweek(StartTime);\n     let Period = 7d;\n     \/\/ Count the events of each repository within that week\n     GithubEvent\n     | where CreatedAt between(PeriodStart .. Period)\n     | summarize EventCount=count() by RepoName = tostring(Repo.name), StartDate=startofweek(CreatedAt)\n     | extend EndDate=endofweek(StartDate)\n     | project StartDate, EndDate, RepoName, EventCount\n}\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\">I can now use this function to generate a down-sampled dataset of the weekly repository activity. 
For example, using the <strong>AggregateReposWeeklyActivity<\/strong> function for the first week of 2017 results in a dataset of 867,115 records.<\/p>\n\n\n\n<figure class=\"wp-block-image has-custom-border\"><img decoding=\"async\" src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2019\/02\/0ef99b73-b8f0-431e-bc3e-cfcaf4133ccc.webp\" alt=\"Screenshot of AggregateReposWeeklyActivity function yielding dataset results\" style=\"border-radius:0px\" title=\"Screenshot of AggregateReposWeeklyActivity function yielding dataset results\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"using-kusto-query-create-a-table-with-historic-data\">Using a Kusto query, create a table with historic data<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Since the original dataset starts in 2016, I wrote a program that creates a table named <strong>ReposWeeklyActivity<\/strong> and backfills it with weekly aggregated data from the <strong>GitHubEvent<\/strong> table. The program ingests the weekly aggregated datasets in parallel using the \u201c.set-or-append\u201d command. 
The first ingestion operation also creates the table that holds the aggregated data.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code #highlighter_6120 {   overflow: visible; \/* or remove overflow entirely *\/ }\"><pre class=\"brush: plain; auto-links: false; gutter: false; title: ; quick-code: false; notranslate\" title=\"\">\nusing Kusto.Data.Common;\nusing Kusto.Data.Net.Client;\nusing System;\nusing System.Threading.Tasks;\n\nnamespace GitHubProcessing\n{\n     class Program\n     {\n         static void Main(string[] args)\n         {\n             \/\/ Connection string for the GitHub database on the demo11 cluster\n             var clusterUrl = \"https:\/\/demo11.westus.kusto.windows.net:443;Initial Catalog=GitHub;Fed=True\";\n             using (var queryProvider = KustoClientFactory.CreateCslAdminProvider(clusterUrl))\n             {\n                 \/\/ Backfill 137 consecutive weeks, running up to 8 ingestions at a time\n                 Parallel.For(\n                     0,\n                     137,\n                     new ParallelOptions() { MaxDegreeOfParallelism = 8 },\n                     (i) =>\n                     {\n                         \/\/ Week i starts 7 * i days after Sunday, 2016-01-03 (UTC)\n                         var startDate = new DateTime(2016, 01, 03, 0, 0, 0, 0, DateTimeKind.Utc) + TimeSpan.FromDays(7 * i);\n                         var startDateAsCsl = CslDateTimeLiteral.AsCslString(startDate);\n                         \/\/ Append this week's aggregate to ReposWeeklyActivity\n                         var command = $@\"\n                         .set-or-append ReposWeeklyActivity <|\n                         AggregateReposWeeklyActivity({startDateAsCsl})\";\n                         queryProvider.ExecuteControlCommand(command);\n\n                         Console.WriteLine($\"Finished: start={startDate.ToUniversalTime()}\");\n                     });\n             }\n         }\n     }\n}\n<\/pre><\/div>\n\n\n<p 
class=\"wp-block-paragraph\">Once the backfill is complete, the <strong>ReposWeeklyActivity<\/strong> table will contain 153 million records.<\/p>\n\n\n\n<figure class=\"wp-block-image has-custom-border\"><img decoding=\"async\" src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2019\/02\/0a2d5630-bebf-4077-a6e7-292d4c05375a.webp\" alt=\"Screenshot of the ReposWeeklyActivity table yielding 153 million records\" style=\"border-radius:0px\" title=\"Screenshot of the ReposWeeklyActivity table yielding 153 million records\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"configure-weekly-aggregation-jobs-using-microsoft-flow-and-azure-kusto-connector\">Configure weekly aggregation jobs using Microsoft Flow and the Azure Kusto connector<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Once the <strong>ReposWeeklyActivity<\/strong> table is created and filled with the historic data, I want to make sure it stays updated with new data appended every week. For that purpose, I created a flow in Microsoft Flow that uses the Azure Kusto connector to ingest the aggregated data on a weekly basis. 
The flow consists of two simple steps:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">A weekly trigger in Microsoft Flow.<\/li>\n\n\n\n<li class=\"wp-block-list-item\">A \u201c.set-or-append\u201d command that ingests the aggregated data from the past week.<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image has-custom-border\"><img decoding=\"async\" src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2019\/02\/4d9f889f-c9fe-4ccf-9873-503127c6bbb6.webp\" alt=\"image\" style=\"border-radius:0px\" title=\"image\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">For additional information on using Microsoft Flow with Azure Data Explorer, see the <a href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/kusto\/tools\/flow\" target=\"_blank\" rel=\"noreferrer noopener\">Azure Kusto Flow connector<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"start-saving\">Start saving<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">To illustrate the cost-saving potential of down-sampling, I used the \u201c.show table details\u201d command to compare the size of the original <strong>GitHubEvent<\/strong> table and the down-sampled table <strong>ReposWeeklyActivity<\/strong>.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code #highlighter_6120 {   overflow: visible; \/* or remove overflow entirely *\/ }\"><pre class=\"brush: plain; auto-links: false; gutter: false; title: ; quick-code: false; notranslate\" title=\"\">\n.show table GithubEvent details\n| project TableName, SizeOnDiskGB=TotalExtentSize\/pow(1024,3), TotalRowCount\n\n.show table ReposWeeklyActivity details\n| project TableName, SizeOnDiskGB=TotalExtentSize\/pow(1024,3), TotalRowCount\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\">The results, summarized in the table below, show that for the same time frame the down-sampled data is approximately seven times smaller in record count and approximately 165 times smaller in storage 
size.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td>&nbsp;<\/td><td><strong>Original data<\/strong><\/td><td><strong>Down-sampled\/aggregated data<\/strong><\/td><\/tr><tr><td><strong>Time span<\/strong><\/td><td>2016-01-01 \u2026 2018-09-26<\/td><td>2016-01-01 \u2026 2018-09-26<\/td><\/tr><tr><td><strong>Record count<\/strong><\/td><td>1,048,961,967<\/td><td>153,234,107<\/td><\/tr><tr><td><strong>Total size on disk (indexed and compressed)<\/strong><\/td><td>725.2 GB<\/td><td>4.38 GB<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">You can convert the cost-saving potential into real savings in various ways. A combination of the different methods is usually the most efficient way to control costs.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><strong>Control cluster size and hot storage costs<\/strong>: Set different caching policies for the original data table and the down-sampled table. For example, cache the original data for 30 days and the down-sampled table for two years. This configuration allows you to enjoy ADX\u2019s first-rate performance for interactive exploration of raw data and to analyze activity trends over years, all while controlling cluster size and hot storage costs.<\/li>\n\n\n\n<li class=\"wp-block-list-item\"><strong>Control cold storage costs<\/strong>: Set different retention policies for the original data table and the down-sampled table. For example, retain the original data for 30 days and the down-sampled table for two years. This configuration allows you to explore the raw data and analyze activity trends over years while controlling cold storage costs. 
On a different note, this configuration is also common for meeting privacy requirements, as the raw data might contain user-identifiable information and the aggregated data is usually anonymous.<\/li>\n\n\n\n<li class=\"wp-block-list-item\"><strong>Use the down-sampled table for analysis<\/strong>: Running queries on the down-sampled table for time series trend analysis consumes less CPU and memory. In the example below, I compare the resource consumption of a typical query that calculates the total weekly activity across all repositories. The query statistics show that analyzing weekly activity trends on the down-sampled dataset is approximately 17 times more efficient in CPU consumption and approximately 11 times more efficient in memory consumption.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Running this query on the original <strong>GitHubEvent<\/strong> table consumes approximately 56 seconds of total CPU time and 176 MB of memory.<\/p>\n\n\n\n<figure class=\"wp-block-image has-custom-border\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2019\/02\/bd1aff94-f23c-4208-a48d-0f80b9ebcc40.webp\" alt=\"Screenshot of a command comparing GitHubEvent and ReposWeeklyActivity table sizes\" width=\"1850\" height=\"612\"><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">The same calculation on the aggregated <strong>ReposWeeklyActivity<\/strong> table consumes only about three seconds of total CPU time and 16 MB of memory.<\/p>\n\n\n\n<figure class=\"wp-block-image has-custom-border\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2019\/02\/df02271c-e8d3-4a46-baee-f44eb44f3246.webp\" alt=\"Screenshot showing CPU time and MB of memory being used by demo11 query\" width=\"1871\" height=\"612\"><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"next-steps\">Next steps<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Azure Data Explorer leverages cloud elasticity to scale out to petabyte-size data, deliver exceptional performance, and handle high query workloads. 
In this blog, I\u2019ve described how to implement down-sampling and aggregation to control the costs associated with large datasets. To find out more about Azure Data Explorer, you can:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/azure.microsoft.com\/services\/data-explorer\/\" target=\"_blank\" rel=\"noreferrer noopener\">Try Azure Data Explorer<\/a> in preview now.<\/li>\n\n\n\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/azure.microsoft.com\/pricing\/details\/data-explorer\" target=\"_blank\" rel=\"noreferrer noopener\">Find pricing information<\/a> for Azure Data Explorer.<\/li>\n\n\n\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/data-explorer\/\" target=\"_blank\" rel=\"noreferrer noopener\">Access documentation<\/a> for Azure Data Explorer.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Azure Data Explorer (ADX) is an outstanding service for continuous ingestion and storage of high velocity telemetry data from cloud services and IoT 
devices.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ep_exclude_from_search":false,"_classifai_error":"","_classifai_text_to_speech_error":"","_alt_title":"","ms-ems-related-posts":[],"footnotes":"","azure_community_cta_settings":[]},"categories":[1474],"tags":[48],"audience":[3054,3057,3053],"content-type":[1511],"product":[1522],"tech-community":[],"coauthors":[647],"class_list":["post-1691","post","type-post","status-publish","format-standard","hentry","category-analytics","tag-big-data","audience-business-decision-makers","audience-data-professionals","audience-it-decision-makers","content-type-best-practices","product-azure-data-explorer","review-flag-1680286581-295","review-flag-1680286581-56","review-flag-1680286581-364","review-flag-1680286584-658","review-flag-1-1680286581-825","review-flag-2-1680286581-601","review-flag-3-1680286581-173","review-flag-4-1680286581-250","review-flag-7-1680286581-146","review-flag-8-1680286581-263","review-flag-and-o-1680286581-349","review-flag-iot-1680286585-835","review-flag-new-1680286579-546"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Controlling costs in Azure Data Explorer using down-sampling and aggregation | Microsoft Azure Blog<\/title>\n<meta name=\"description\" content=\"Azure Data Explorer (ADX) is an outstanding service for continuous ingestion and storage of high velocity telemetry data from cloud services and IoT devices.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" 
content=\"article\" \/>\n<meta property=\"og:title\" content=\"Controlling costs in Azure Data Explorer using down-sampling and aggregation | Microsoft Azure Blog\" \/>\n<meta property=\"og:description\" content=\"Azure Data Explorer (ADX) is an outstanding service for continuous ingestion and storage of high velocity telemetry data from cloud services and IoT devices.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/\" \/>\n<meta property=\"og:site_name\" content=\"Microsoft Azure Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/microsoftazure\" \/>\n<meta property=\"article:published_time\" content=\"2019-02-11T00:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-06-17T07:29:48+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2019\/02\/49ea1297-5fe8-4648-bf2d-d913f8ea507e.webp\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@azure\" \/>\n<meta name=\"twitter:site\" content=\"@azure\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Oded Sacher\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\\\/\"},\"author\":[{\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/author\\\/oded-sacher\\\/\",\"@type\":\"Person\",\"@name\":\"Oded Sacher\"}],\"headline\":\"Controlling costs in Azure Data Explorer using down-sampling and aggregation\",\"datePublished\":\"2019-02-11T00:00:00+00:00\",\"dateModified\":\"2025-06-17T07:29:48+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\\\/\"},\"wordCount\":973,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/wp-content\\\/uploads\\\/2019\\\/02\\\/49ea1297-5fe8-4648-bf2d-d913f8ea507e.webp\",\"keywords\":[\"Big 
Data\"],\"articleSection\":[\"Analytics\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\\\/\",\"url\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\\\/\",\"name\":\"Controlling costs in Azure Data Explorer using down-sampling and aggregation | Microsoft Azure Blog\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/wp-content\\\/uploads\\\/2019\\\/02\\\/49ea1297-5fe8-4648-bf2d-d913f8ea507e.webp\",\"datePublished\":\"2019-02-11T00:00:00+00:00\",\"dateModified\":\"2025-06-17T07:29:48+00:00\",\"description\":\"Azure Data Explorer (ADX) is an outstanding service for continuous ingestion and storage of high velocity telemetry data from cloud services and IoT 
devices.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\\\/#primaryimage\",\"url\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/wp-content\\\/uploads\\\/2019\\\/02\\\/49ea1297-5fe8-4648-bf2d-d913f8ea507e.webp\",\"contentUrl\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/wp-content\\\/uploads\\\/2019\\\/02\\\/49ea1297-5fe8-4648-bf2d-d913f8ea507e.webp\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Blog home\",\"item\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Analytics\",\"item\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/category\\\/analytics\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Controlling costs in Azure Data Explorer using down-sampling and aggregation\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/\",\"name\":\"Microsoft Azure Blog\",\"description\":\"Get the latest Azure news, updates, and announcements from the Azure blog. 
From product updates to hot topics, hear from the Azure experts.\",\"publisher\":{\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/#organization\",\"name\":\"Microsoft Azure Blog\",\"url\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/wp-content\\\/uploads\\\/2024\\\/06\\\/microsoft_logo.webp\",\"contentUrl\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/wp-content\\\/uploads\\\/2024\\\/06\\\/microsoft_logo.webp\",\"width\":512,\"height\":512,\"caption\":\"Microsoft Azure Blog\"},\"image\":{\"@id\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/microsoftazure\",\"https:\\\/\\\/x.com\\\/azure\",\"https:\\\/\\\/www.instagram.com\\\/microsoftdeveloper\\\/\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/16188386\",\"https:\\\/\\\/www.youtube.com\\\/user\\\/windowsazure\"]},{\"@type\":\"Person\",\"@id\":\"\",\"url\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/author\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Controlling costs in Azure Data Explorer using down-sampling and aggregation | Microsoft Azure Blog","description":"Azure Data Explorer (ADX) is an outstanding service for continuous ingestion and storage of high velocity telemetry data from cloud services and IoT devices.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/","og_locale":"en_US","og_type":"article","og_title":"Controlling costs in Azure Data Explorer using down-sampling and aggregation | Microsoft Azure Blog","og_description":"Azure Data Explorer (ADX) is an outstanding service for continuous ingestion and storage of high velocity telemetry data from cloud services and IoT devices.","og_url":"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/","og_site_name":"Microsoft Azure Blog","article_publisher":"https:\/\/www.facebook.com\/microsoftazure","article_published_time":"2019-02-11T00:00:00+00:00","article_modified_time":"2025-06-17T07:29:48+00:00","og_image":[{"url":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2019\/02\/49ea1297-5fe8-4648-bf2d-d913f8ea507e.webp","type":"","width":"","height":""}],"twitter_card":"summary_large_image","twitter_creator":"@azure","twitter_site":"@azure","twitter_misc":{"Written by":"Oded Sacher","Est. 
reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/#article","isPartOf":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/"},"author":[{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/oded-sacher\/","@type":"Person","@name":"Oded Sacher"}],"headline":"Controlling costs in Azure Data Explorer using down-sampling and aggregation","datePublished":"2019-02-11T00:00:00+00:00","dateModified":"2025-06-17T07:29:48+00:00","mainEntityOfPage":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/"},"wordCount":973,"commentCount":0,"publisher":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization"},"image":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/#primaryimage"},"thumbnailUrl":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2019\/02\/49ea1297-5fe8-4648-bf2d-d913f8ea507e.webp","keywords":["Big Data"],"articleSection":["Analytics"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/","name":"Controlling costs in Azure Data Explorer using down-sampling and aggregation | Microsoft Azure 
Blog","isPartOf":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/#primaryimage"},"image":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/#primaryimage"},"thumbnailUrl":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2019\/02\/49ea1297-5fe8-4648-bf2d-d913f8ea507e.webp","datePublished":"2019-02-11T00:00:00+00:00","dateModified":"2025-06-17T07:29:48+00:00","description":"Azure Data Explorer (ADX) is an outstanding service for continuous ingestion and storage of high velocity telemetry data from cloud services and IoT devices.","breadcrumb":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/#primaryimage","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2019\/02\/49ea1297-5fe8-4648-bf2d-d913f8ea507e.webp","contentUrl":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2019\/02\/49ea1297-5fe8-4648-bf2d-d913f8ea507e.webp"},{"@type":"BreadcrumbList","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/controlling-costs-in-azure-data-explorer-using-down-sampling-and-aggregation\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog 
home","item":"https:\/\/azure.microsoft.com\/en-us\/blog\/"},{"@type":"ListItem","position":2,"name":"Analytics","item":"https:\/\/azure.microsoft.com\/en-us\/blog\/category\/analytics\/"},{"@type":"ListItem","position":3,"name":"Controlling costs in Azure Data Explorer using down-sampling and aggregation"}]},{"@type":"WebSite","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#website","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/","name":"Microsoft Azure Blog","description":"Get the latest Azure news, updates, and announcements from the Azure blog. From product updates to hot topics, hear from the Azure experts.","publisher":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/azure.microsoft.com\/en-us\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization","name":"Microsoft Azure Blog","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2024\/06\/microsoft_logo.webp","contentUrl":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2024\/06\/microsoft_logo.webp","width":512,"height":512,"caption":"Microsoft Azure 
Blog"},"image":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/microsoftazure","https:\/\/x.com\/azure","https:\/\/www.instagram.com\/microsoftdeveloper\/","https:\/\/www.linkedin.com\/company\/16188386","https:\/\/www.youtube.com\/user\/windowsazure"]},{"@type":"Person","@id":"","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/"}]}},"bloginabox_animated_featured_image":null,"bloginabox_display_generated_audio":false,"distributor_meta":false,"distributor_terms":false,"distributor_media":false,"distributor_original_site_name":"Microsoft Azure Blog","distributor_original_site_url":"https:\/\/azure.microsoft.com\/en-us\/blog","push-errors":false,"_links":{"self":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts\/1691","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/comments?post=1691"}],"version-history":[{"count":4,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts\/1691\/revisions"}],"predecessor-version":[{"id":42007,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts\/1691\/revisions\/42007"}],"wp:attachment":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/media?parent=1691"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/categories?post=1691"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/tags?post=1691"},{"taxonomy":"audience","embeddable":true,"href":"https:\/\/azure.microsoft.
com\/en-us\/blog\/wp-json\/wp\/v2\/audience?post=1691"},{"taxonomy":"content-type","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/content-type?post=1691"},{"taxonomy":"product","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/product?post=1691"},{"taxonomy":"tech-community","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/tech-community?post=1691"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/coauthors?post=1691"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}