The COVID-19 Data Lake contains COVID-19 related datasets from various sources, covering testing and patient outcome tracking data, social distancing policy, hospital capacity, mobility, etc.
The COVID-19 Data Lake is hosted in Azure Data Lake Storage in the East US region. For each dataset, modified versions are available in CSV, JSON, JSON Lines, and Parquet formats, alongside the raw data as ingested.
ISO 3166 subdivision codes are added where not present to simplify joining, and column names are reformatted in lower case with underscore separators. Datasets are typically updated daily, and historical copies of both the modified and raw files are also available.
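As an illustration of the column-name convention described above, the following is a minimal sketch of how a raw header could be converted to lower case with underscore separators. The function name `to_snake_case` and the sample headers are hypothetical, and the exact rules used when the modified files are produced are not documented here, so treat this as an approximation rather than the actual ingestion logic.

```python
import re


def to_snake_case(name: str) -> str:
    """Lower-case a column name and join its words with underscores.

    Hypothetical helper for illustration only; the data lake's own
    normalization rules may differ in edge cases.
    """
    # Insert an underscore at camelCase boundaries, e.g. "countryRegion".
    s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name.strip())
    # Collapse any run of non-alphanumeric characters into one underscore.
    s = re.sub(r"[^0-9A-Za-z]+", "_", s)
    # Drop leading/trailing underscores and lower-case the result.
    return s.strip("_").lower()


# Sample raw headers (made up for this example).
raw_headers = ["Country Region", "countryRegion", "Date-Reported"]
normalized = [to_snake_case(h) for h in raw_headers]
print(normalized)
```

Applying the same normalization to every source before joining on the ISO 3166 subdivision code can help keep key and column names consistent across datasets.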
USE OF DATASETS IS SUBJECT TO TERMS AND CONDITIONS SET BY THE DATASET OWNERS. SEE THE DETAILS PAGE FOR EACH DATASET FOR APPLICABLE TERMS AND CONDITIONS.
Datasets | Description |
---|---|
Bing COVID-19 Data | Bing COVID-19 data includes confirmed, fatal, and recovered cases from all regions, updated daily. |
COVID Tracking Project | The COVID Tracking Project dataset provides the latest numbers on tests, confirmed cases, hospitalizations, and patient outcomes from every US state and territory. |
European Centre for Disease Prevention and Control (ECDC) COVID-19 Cases | The latest available public data on the geographic distribution of COVID-19 cases worldwide from the European Centre for Disease Prevention and Control (ECDC). Each row contains the number of new cases reported per day for one country or region. |
Oxford COVID-19 Government Response Tracker | The Oxford COVID-19 Government Response Tracker (OxCGRT) dataset contains systematic information on which governments have taken which measures, and when. |