Skip Navigation

COVID-19 Data Lake

Oxford COVID-19 Government Response Tracker

COVID-19 Pandemic Data Lake Oxford Policy

The Oxford Covid-19 Government Response Tracker (OxCGRT) dataset contains systematic information on which governments have taken which measures, and when.

This information can help decision-makers and citizens understand governmental responses in a consistent way, aiding efforts to fight the pandemic. The OxCGRT systematically collects information on several different common policy responses governments have taken, records these policies on a scale to reflect the extent of government action, and aggregates these scores into a suite of policy indices.

For more information about this dataset, see here.

Datasets:
Modified versions of the dataset are available in CSV, JSON, JSON-Lines, and Parquet, updated daily:
https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/covid_policy_tracker/latest/covid_policy_tracker.csv
https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/covid_policy_tracker/latest/covid_policy_tracker.json
https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/covid_policy_tracker/latest/covid_policy_tracker.jsonl
https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/covid_policy_tracker/latest/covid_policy_tracker.parquet

All modified versions have iso_country codes and load times added, and use lower case column names with underscore separators.

Raw data:
https://pandemicdatalake.blob.core.windows.net/public/raw/covid-19/covid_policy_tracker/latest/CovidPolicyTracker.csv

Previous versions of modified and raw data:
https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/covid_policy_tracker/
https://pandemicdatalake.blob.core.windows.net/public/raw/covid-19/covid_policy_tracker/

Data Volume
As of June 8, 2020 they contained 27,919 rows (CSV 4.9 MB, JSON 20.9 MB, JSONL 20.8 MB, Parquet 133.0 KB).

Data Source
The source of this data is Thomas Hale, Sam Webster, Anna Petherick, Toby Phillips, and Beatriz Kira. (2020). Oxford COVID-19 Government Response Tracker. Blavatnik School of Government. Raw data is ingested daily from the latest OxCGRT csv file here. For more information on this dataset, including how it is collected, see here.

Data Quality
The OxCGRT does not guarantee the accuracy or timeliness of the data. Please read the data quality statement here.

License and Use Rights; Attribution
This data is licensed under the Creative Commons Attribution 4.0 International License available here.

Cite as: Thomas Hale, Sam Webster, Anna Petherick, Toby Phillips, and Beatriz Kira. (2020). Oxford COVID-19 Government Response Tracker. Blavatnik School of Government.

Contact
For any questions or feedback about this or other datasets in the COVID-19 Data Lake, please contact askcovid19dl@microsoft.com.

Notices

MICROSOFT PROVIDES AZURE OPEN DATASETS ON AN “AS IS” BASIS. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, GUARANTEES OR CONDITIONS WITH RESPECT TO YOUR USE OF THE DATASETS. TO THE EXTENT PERMITTED UNDER YOUR LOCAL LAW, MICROSOFT DISCLAIMS ALL LIABILITY FOR ANY DAMAGES OR LOSSES, INCLUDING DIRECT, CONSEQUENTIAL, SPECIAL, INDIRECT, INCIDENTAL OR PUNITIVE, RESULTING FROM YOUR USE OF THE DATASETS.

This dataset is provided under the original terms that Microsoft received source data. The dataset may include data sourced from Microsoft.

Access

Available inWhen to use
Azure Notebooks

Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine.

Azure Databricks

Use this when you need the scale of an Azure managed Spark cluster to process the dataset.

Azure Synapse

Use this when you need the scale of an Azure managed Spark cluster to process the dataset.

Preview

countryname countrycode date c1_school_closing c2_workplace_closing c3_cancel_public_events c4_restrictions_on_gatherings c5_close_public_transport c6_stay_at_home_requirements c7_restrictions_on_internal_movement c8_international_travel_controls e1_income_support e2_debt/contract_relief e3_fiscal_measures e4_international_support h1_public_information_campaigns h2_testing_policy h3_contact_tracing h4_emergency_investment_in_healthcare h5_investment_in_vaccines m1_wildcard stringencyindex stringencyindexfordisplay iso_country load_date
Aruba ABW 2020-01-01 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 null 0 0 AW 12/4/2020 12:06:44 AM
Aruba ABW 2020-01-02 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 null 0 0 AW 12/4/2020 12:06:44 AM
Aruba ABW 2020-01-03 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 null 0 0 AW 12/4/2020 12:06:44 AM
Aruba ABW 2020-01-04 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 null 0 0 AW 12/4/2020 12:06:44 AM
Aruba ABW 2020-01-05 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 null 0 0 AW 12/4/2020 12:06:44 AM
Aruba ABW 2020-01-06 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 null 0 0 AW 12/4/2020 12:06:44 AM
Aruba ABW 2020-01-07 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 null 0 0 AW 12/4/2020 12:06:44 AM
Aruba ABW 2020-01-08 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 null 0 0 AW 12/4/2020 12:06:44 AM
Aruba ABW 2020-01-09 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 null 0 0 AW 12/4/2020 12:06:44 AM
Aruba ABW 2020-01-10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 null 0 0 AW 12/4/2020 12:06:44 AM
Name Data type Unique Values (sample) Description
c1_flag boolean 3 True

Binary flag for geographic scope. 0 - targeted 1 - general Blank - no data

c1_school_closing double 5 3.0
2.0

Record closings of schools and universities. 0 - no measures 1 - recommend closing 2 - require closing (only some levels or categories, eg just high school, or just public schools) 3 - require closing all levels Blank - no data

c2_flag boolean 3 True

Binary flag for geographic scope. 0 - targeted; 1 - general; Blank - no data

c2_workplace_closing double 5 2.0
1.0

Record closings of workplaces. 0 - no measures 1 - recommend closing (or recommend work from home) 2 - require closing (or work from home) for some sectors or categories of workers 3 - require closing (or work from home) for all-but-essential workplaces (eg grocery stores, doctors) Blank - no data

c3_cancel_public_events double 4 2.0
1.0

Record cancelling public events. 0 - no measures 1 - recommend cancelling 2 - require cancelling Blank - no data

c3_flag boolean 3 True

Binary flag for geographic scope. 0 - targeted 1 - general Blank - no data

c4_flag boolean 3 True

Binary flag for geographic scope 0 - targeted 1 - general Blank - no data

c4_restrictions_on_gatherings double 6 4.0
3.0

Record limits on private gatherings. 0 - no restrictions 1 - restrictions on very large gatherings (the limit is above 1000 people) 2 - restrictions on gatherings between 101-1000 people 3 - restrictions on gatherings between 11-100 people 4 - restrictions on gatherings of 10 people or less Blank - no data

c5_close_public_transport double 4 1.0
2.0

Record closing of public transport 0 - no measures 1 - recommend closing (or significantly reduce volume/route/means of transport available) 2 - require closing (or prohibit most citizens from using it) Blank - no data

c5_flag boolean 3 True

Binary flag for geographic scope 0 - targeted 1 - general Blank - no data

c6_flag boolean 3 True

Binary flag for geographic scope 0 - targeted 1 - general Blank - no data

c6_stay_at_home_requirements double 5 1.0
2.0

Record orders to “shelter-in-place” and otherwise confine to the home 0 - no measures 1 - recommend not leaving house 2 - require not leaving house with exceptions for daily exercise, grocery shopping, and ‘essential’ trips 3 - require not leaving house with minimal exceptions (eg allowed to leave once a week, or only one person can leave at a time, etc) Blank - no data

c7_flag boolean 3 True

Binary flag for geographic scope 0 - targeted 1 - general Blank - no data

c7_restrictions_on_internal_movement double 4 2.0
1.0

Record restrictions on internal movement between cities/regions 0 - no measures 1 - recommend not to travel between regions/cities 2 - internal movement restrictions in place Blank - no data

c8_international_travel_controls double 6 3.0
4.0

Record restrictions on international travel. Note: this records policy for foreign travellers, not citizens 0 - no restrictions 1 - screening arrivals 2 - quarantine arrivals from some or all regions 3 - ban arrivals from some regions 4 - ban on all regions or total border closure Blank - no data

confirmedcases smallint 15,006 1
2
confirmeddeaths smallint 9,664 1
2
countrycode string 182 USA
BRA
countryname string 182 United States
Brazil
date date 335 2020-08-23
2020-08-25
e1_flag boolean 3 True

Binary flag for sectoral scope 0 - formal sector workers only 1 - transfers to informal sector workers too Blank - no data

e1_income_support double 4 1.0
2.0

Record if the government is providing direct cash payments to people who lose their jobs or cannot work. Note: only includes payments to firms if explicitly linked to payroll/salaries 0 - no income support 1 - government is replacing less than 50% of lost salary (or if a flat sum, it is less than 50% median salary) 2 - government is replacing 50% or more of lost salary (or if a flat sum, it is greater than 50% median salary) Blank - no data

e2_debt/contract_relief double 4 2.0
1.0
e3_fiscal_measures double 586 3.0
10000000.0

Announced economic stimulus spending Note: only record amount additional to previously announced spending Record monetary value in USD of fiscal stimuli, includes any spending or tax cuts NOT included in E4, H4 or H5 0 - no new spending that day Blank - no data

e4_international_support double 94 -0.02
5000000.0

Announced offers of Covid-19 related aid spending to other countries Note: only record amount additional to previously announced spending Record monetary value in USD 0 - no new spending that day Blank - no data

h1_flag boolean 3 True

Binary flag for geographic scope 0 - targeted 1 - general Blank - no data

h1_public_information_campaigns double 4 2.0
1.0

Record presence of public info campaigns 0 - no Covid-19 public information campaign 1 - public officials urging caution about Covid-19 2- coordinated public information campaign (eg across traditional and social media) Blank - no data

h2_testing_policy double 5 2.0
1.0

Record government policy on who has access to testing Note: this records policies about testing for current infection (PCR tests) not testing for immunity (antibody test) 0 - no testing policy 1 - only those who both (a) have symptoms AND (b) meet specific criteria (eg key workers, admitted to hospital, came into contact with a known case, returned from overseas) 2 - testing of anyone showing Covid-19 symptoms 3 - open public testing (eg “drive through” testing available to asymptomatic people) Blank - no data

h3_contact_tracing double 4 2.0
1.0

Record government policy on contact tracing after a positive diagnosis Note: we are looking for policies that would identify all people potentially exposed to Covid-19; voluntary bluetooth apps are unlikely to achieve this 0 - no contact tracing 1 - limited contact tracing; not done for all cases 2 - comprehensive contact tracing; done for all identified cases

h4_emergency_investment_in_healthcare double 355 35.0
562.0

Announced short term spending on healthcare system, eg hospitals, masks, etc Note: only record amount additional to previously announced spending Record monetary value in USD 0 - no new spending that day Blank - no data

h5_investment_in_vaccines double 80 1.0
191.0

Announced public spending on Covid-19 vaccine development Note: only record amount additional to previously announced spending Record monetary value in USD 0 - no new spending that day Blank - no data

iso_country string 182 US
BR

ISO 3166 country or region code

load_date timestamp 1 2020-12-04 00:06:44.272000

Date and time data was loaded from external source

stringencyindex double 184 11.11
5.56
stringencyindexfordisplay double 184 11.11
5.56

Select your preferred service:

Azure Notebooks

Azure Databricks

Azure Synapse

Azure Notebooks

Package: Language: Python
In [1]:
import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt

df = pd.read_parquet("https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/covid_policy_tracker/latest/covid_policy_tracker.parquet")
df.head(10)
Out[1]:
countryname countrycode date c1_school_closing c1_flag c2_workplace_closing c2_flag c3_cancel_public_events c3_flag c4_restrictions_on_gatherings ... h5_investment_in_vaccines m1_wildcard confirmedcases confirmeddeaths stringencyindex stringencyindexfordisplay legacystringencyindex legacystringencyindexfordisplay iso_country load_date
0 Aruba ABW 2020-01-01 0.0 None 0.0 None 0.0 None 0.0 ... 0.0 None NaN NaN 0.0 0.0 NaN NaN AW 2020-06-22 00:04:51.168
1 Aruba ABW 2020-01-02 0.0 None 0.0 None 0.0 None 0.0 ... 0.0 None NaN NaN 0.0 0.0 NaN NaN AW 2020-06-22 00:04:51.168
2 Aruba ABW 2020-01-03 0.0 None 0.0 None 0.0 None 0.0 ... 0.0 None NaN NaN 0.0 0.0 NaN NaN AW 2020-06-22 00:04:51.168
3 Aruba ABW 2020-01-04 0.0 None 0.0 None 0.0 None 0.0 ... 0.0 None NaN NaN 0.0 0.0 NaN NaN AW 2020-06-22 00:04:51.168
4 Aruba ABW 2020-01-05 0.0 None 0.0 None 0.0 None 0.0 ... 0.0 None NaN NaN 0.0 0.0 NaN NaN AW 2020-06-22 00:04:51.168
5 Aruba ABW 2020-01-06 0.0 None 0.0 None 0.0 None 0.0 ... 0.0 None NaN NaN 0.0 0.0 NaN NaN AW 2020-06-22 00:04:51.168
6 Aruba ABW 2020-01-07 0.0 None 0.0 None 0.0 None 0.0 ... 0.0 None NaN NaN 0.0 0.0 NaN NaN AW 2020-06-22 00:04:51.168
7 Aruba ABW 2020-01-08 0.0 None 0.0 None 0.0 None 0.0 ... 0.0 None NaN NaN 0.0 0.0 NaN NaN AW 2020-06-22 00:04:51.168
8 Aruba ABW 2020-01-09 0.0 None 0.0 None 0.0 None 0.0 ... 0.0 None NaN NaN 0.0 0.0 NaN NaN AW 2020-06-22 00:04:51.168
9 Aruba ABW 2020-01-10 0.0 None 0.0 None 0.0 None 0.0 ... 0.0 None NaN NaN 0.0 0.0 NaN NaN AW 2020-06-22 00:04:51.168

10 rows × 38 columns

Lets check the data types of the various fields and verify that the updated column is datettime format

In [2]:
df.dtypes
Out[2]:
countryname                                      object
countrycode                                      object
date                                             object
c1_school_closing                               float64
c1_flag                                          object
c2_workplace_closing                            float64
c2_flag                                          object
c3_cancel_public_events                         float64
c3_flag                                          object
c4_restrictions_on_gatherings                   float64
c4_flag                                          object
c5_close_public_transport                       float64
c5_flag                                          object
c6_stay_at_home_requirements                    float64
c6_flag                                          object
c7_restrictions_on_internal_movement            float64
c7_flag                                          object
c8_international_travel_controls                float64
e1_income_support                               float64
e1_flag                                          object
e2_debt/contract_relief                         float64
e3_fiscal_measures                              float64
e4_international_support                        float64
h1_public_information_campaigns                 float64
h1_flag                                          object
h2_testing_policy                               float64
h3_contact_tracing                              float64
h4_emergency_investment_in_healthcare           float64
h5_investment_in_vaccines                       float64
m1_wildcard                                      object
confirmedcases                                  float64
confirmeddeaths                                 float64
stringencyindex                                 float64
stringencyindexfordisplay                       float64
legacystringencyindex                           float64
legacystringencyindexfordisplay                 float64
iso_country                                      object
load_date                                datetime64[ns]
dtype: object

This dataset contains data for the numerous countries. Lets verify what countries we have data for.

We will start by looking at the latest data for each country:

In [3]:
df.groupby('countryname').first().filter(['confirmedcases ', 'confirmeddeaths','h5_investment_in_vaccines',
    'c6_stay_at_home_requirements','h4_emergency_investment_in_healthcare','c4_restrictions_on_gatherings', 'load_date'])
Out[3]:
confirmeddeaths h5_investment_in_vaccines c6_stay_at_home_requirements h4_emergency_investment_in_healthcare c4_restrictions_on_gatherings load_date
countryname
Afghanistan 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Albania 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Algeria 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Andorra 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Angola 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Argentina 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Aruba 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Australia 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Austria 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Azerbaijan 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Bahrain 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Bangladesh 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Barbados 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Belarus 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Belgium 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Belize 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Benin 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Bermuda 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Bhutan 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Bolivia 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Bosnia and Herzegovina 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Botswana 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Brazil 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Brunei 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Bulgaria 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Burkina Faso 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Burundi 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Cambodia 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Cameroon 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Canada 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
... ... ... ... ... ... ...
Spain 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Sri Lanka 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Sudan 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Suriname 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Sweden 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Switzerland 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Syria 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Taiwan NaN 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Tajikistan 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Tanzania 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Thailand 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Timor-Leste 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Togo 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Trinidad and Tobago 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Tunisia 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Turkey 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Turkmenistan NaN 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Uganda 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Ukraine 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
United Arab Emirates 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
United Kingdom 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
United States 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Uruguay 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Uzbekistan 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Vanuatu NaN 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Venezuela 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Vietnam 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Yemen 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Zambia 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168
Zimbabwe 0.0 0.0 0.0 0.0 0.0 2020-06-22 00:04:51.168

178 rows × 6 columns

Next, we will do some aggregations to make sure columns such as confirmedcases and confirmeddeaths tally with the latest data. You should see that positive and death numbers for latest date in the above table match with the aggregation of confirmedcases and confirmeddeaths.

In [4]:
df.groupby('countryname').agg({'countryname': 'count','confirmedcases': 'sum','confirmeddeaths': 'sum',
                               'h5_investment_in_vaccines': 'count', 'c6_stay_at_home_requirements':'sum'})
Out[4]:
countryname confirmedcases confirmeddeaths h5_investment_in_vaccines c6_stay_at_home_requirements
countryname
Afghanistan 171 646085.0 12975.0 160 152.0
Albania 171 71583.0 2426.0 167 160.0
Algeria 171 473384.0 39530.0 163 194.0
Andorra 171 60691.0 3393.0 161 92.0
Angola 171 4298.0 236.0 168 142.0
Argentina 171 692525.0 28382.0 160 203.0
Aruba 171 7821.0 171.0 163 146.0
Australia 171 576253.0 6947.0 163 109.0
Austria 172 1306005.0 43849.0 170 92.0
Azerbaijan 171 290257.0 3501.0 167 187.0
Bahrain 171 558341.0 1169.0 163 86.0
Bangladesh 171 482941.0 30889.0 167 146.0
Barbados 171 6739.0 467.0 170 191.0
Belarus 171 621024.0 11263.0 155 0.0
Belgium 171 410786.0 582740.0 163 164.0
Belize 171 1339.0 145.0 162 80.0
Benin 171 14334.0 195.0 163 0.0
Bermuda 171 8765.0 533.0 170 112.0
Bhutan 171 1843.0 0.0 167 82.0
Bolivia 172 433347.0 15727.0 160 168.0
Bosnia and Herzegovina 171 154826.0 7919.0 164 159.0
Botswana 171 2186.0 79.0 163 112.0
Brazil 172 311884.0 668732.0 171 145.0
Brunei 171 12677.0 109.0 162 0.0
Bulgaria 171 149369.0 7476.0 161 111.0
Burkina Faso 171 57015.0 3498.0 163 166.0
Burundi 171 2728.0 66.0 162 0.0
Cambodia 171 10923.0 0.0 167 0.0
Cameroon 171 294639.0 9171.0 163 0.0
Canada 171 401386.0 352532.0 163 90.0
... ... ... ... ... ...
Spain 172 161444.0 1900647.0 163 164.0
Sri Lanka 171 75653.0 673.0 162 224.0
Sudan 171 193702.0 10817.0 163 136.0
Suriname 171 3459.0 107.0 163 150.0
Sweden 172 983888.0 236282.0 159 0.0
Switzerland 171 2406322.0 109997.0 163 122.0
Syria 171 5769.0 289.0 156 98.0
Taiwan 167 0.0 0.0 167 0.0
Tajikistan 171 133767.0 1791.0 163 28.0
Tanzania 171 29428.0 1171.0 170 0.0
Thailand 171 242701.0 4055.0 165 85.0
Timor-Leste 171 1551.0 0.0 166 79.0
Togo 171 20090.0 718.0 170 151.0
Trinidad and Tobago 171 9824.0 637.0 171 152.0
Tunisia 171 77140.0 3280.0 164 136.0
Turkey 171 195662.0 260080.0 168 194.0
Turkmenistan 160 0.0 0.0 160 0.0
Uganda 171 19230.0 0.0 163 123.0
Ukraine 171 1137386.0 33941.0 169 158.0
United Arab Emirates 171 797463.0 12694.0 167 119.0
United Kingdom 171 194116.0 830372.0 163 132.0
United States 171 97579.0 253418.0 167 184.0
Uruguay 171 55686.0 1307.0 167 106.0
Uzbekistan 171 206198.0 832.0 170 168.0
Vanuatu 167 0.0 0.0 166 82.0
Venezuela 170 73930.0 965.0 164 184.0
Vietnam 171 26271.0 0.0 167 42.0
Yemen 171 14962.0 3375.0 170 75.0
Zambia 171 42925.0 415.0 167 52.0
Zimbabwe 171 8223.0 287.0 156 141.0

178 rows × 5 columns

Lets do some basic visualizations for a few countries

In [5]:
import plotly.graph_objects as go
import plotly.express as px
import matplotlib.pyplot as plt

df.loc[: , ['countryname', 'confirmedcases', 
'confirmeddeaths']].groupby(['countryname']).max().sort_values(by='confirmedcases', 
                                           ascending=False).reset_index()[:15].style.background_gradient(cmap='rainbow')
Out[5]:
countryname confirmedcases confirmeddeaths
0 Ecuador 32763 4087
1 Portugal 32700 1524
2 South Africa 32683 1737
3 Netherlands 32655 6078
4 Egypt 32612 1938
5 Qatar 32604 86
6 United Arab Emirates 32532 298
7 Kuwait 32510 308
8 Ukraine 32476 966
9 Belarus 32426 331
10 Singapore 32343 26
11 Iran 32332 9272
12 Sweden 32172 5053
13 Pakistan 32081 3229
14 Bangladesh 32078 1343
In [6]:
df_US = df.groupby(df.date).agg({'confirmedcases': 'sum','confirmeddeaths':'sum'}).reset_index()

df_US.plot(kind='line',x='date',y="confirmedcases",grid=True)
df_US.plot(kind='line',x='date',y="confirmeddeaths",grid=True)
Out[6]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f65cb1d8860>
In [ ]:
 

Azure Databricks

Package: Language: Python
In [1]:
# Azure storage access info
blob_account_name = "pandemicdatalake"
blob_container_name = "public"
blob_relative_path = "curated/covid-19/covid_policy_tracker/latest/covid_policy_tracker.parquet"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# SPARK read parquet, note that it won't load any data yet by now
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))

Azure Synapse

Package: Language: Python
In [1]:
# Azure storage access info
blob_account_name = "pandemicdatalake"
blob_container_name = "public"
blob_relative_path = "curated/covid-19/covid_policy_tracker/latest/covid_policy_tracker.parquet"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# SPARK read parquet, note that it won't load any data yet by now
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))