UK Met Office Global Weather Data for COVID-19 Analysis

This data is for COVID-19 researchers to explore relationships between COVID-19 and environmental factors.

For more information, see our blog post. If you require compute resources to process this data, we may be able to help.

Stay up to date

Stay up to date with new datasets, corrections, redactions, and other important information by subscribing to this dataset's mailing list.

License

Users are required to acknowledge the Met Office as the source of these data by including the following attribution statement in any resulting products, publications or applications: “Contains Met Office data licensed under the Open Government Licence v3.0”.

This data is made available under the Open Government Licence v3.0.

About the data

Global and high-resolution UK numerical weather model output from the UK Met Office. The data comes from the very early time steps of the model following data assimilation, so it approximates a whole-Earth observation dataset.

The following variables are available:

  • t1o5m = Air temperature at 1.5 m, in K
  • sh = Specific humidity at 1.5 m, in kg/kg (kg of water vapor per kg of air)
  • sw = Short-wave radiation, in W m⁻² (a surrogate for sunshine)
  • precip = Precipitation flux, in kg m⁻² s⁻¹ (multiply by 3600 to get mm/hr)
  • rain = Rain flux, in kg m⁻² s⁻¹ (multiply by 3600 to get mm/hr)
  • pmsl = Air pressure at mean sea level, in Pa
  • snow = Stratiform snowfall flux, in kg m⁻² s⁻¹ (multiply by 3600 to get mm/hr)
  • windspeed = Wind speed, in m s⁻¹
  • windgust = Wind gust, in m s⁻¹
  • cldbase = Cloud base altitude, in feet
  • cldfrac = Cloud area fraction assuming maximum random overlap (unitless, 0–1)
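As a quick worked example of the unit notes above, the flux variables convert to mm/hr by multiplying by 3600, and temperatures convert from K to °C by subtracting 273.15. A minimal sketch (the helper names are illustrative, not part of the dataset):

```python
def flux_to_mm_per_hr(flux):
    """Convert a precip/rain/snow flux in kg m^-2 s^-1 to mm/hr,
    as described in the variable list above."""
    return flux * 3600.0

def kelvin_to_celsius(t_k):
    """Convert an air temperature (e.g. t1o5m) from K to degrees Celsius."""
    return t_k - 273.15
```

For example, a flux of 0.001 kg m⁻² s⁻¹ corresponds to roughly 3.6 mm/hr.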

Output of the Met Office UK air quality model AQUM is also available. This includes the following variables:

  • daqi = Daily Air Quality Index, an integer from 1 to 10
  • no2 = Nitrogen dioxide concentration, in µg m⁻³
  • o3 = Ozone concentration, in µg m⁻³
  • so2 = Sulphur dioxide concentration, in µg m⁻³
  • pm2p5 = Concentration of particulate matter smaller than 2.5 µm in diameter, in µg m⁻³
  • pm10 = Concentration of particulate matter smaller than 10 µm in diameter, in µg m⁻³
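The DAQI value maps onto the standard UK (Defra) air-quality bands. A small helper sketch — the banding is general background knowledge, not something stated on this page:

```python
def daqi_band(daqi: int) -> str:
    """Map a Daily Air Quality Index value (1-10) to its Defra band."""
    if not 1 <= daqi <= 10:
        raise ValueError("DAQI is an integer from 1 to 10")
    if daqi <= 3:
        return "Low"
    if daqi <= 6:
        return "Moderate"
    if daqi <= 9:
        return "High"
    return "Very High"
```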

This data is made available as NetCDF files.

Global and UK model data is available from 01 Jan 2020 onwards. The dataset is updated daily with data for the previous day.

For detailed information about how this data is generated and the particulars of the parameters, please see the technical references.

There are some additional post-processed data aggregations over COVID-19 reporting regions in the UK and USA made available as CSV files. More details below.

Storage location

This dataset is stored in the East US 2 Azure region. Allocating compute resources in East US 2 is recommended for affinity.

Data volumes, retention, and update frequency

The gridded data is updated daily for the previous day.

As of 18 April 2020 the dataset totals approximately 352 GB and grows by approximately 22 GB per week.

We intend to retain and make this data available as long as we believe it's useful in planning the response to the COVID-19 pandemic.

Quick start

The data is hosted on Microsoft Azure through their AI for Earth initiative. You can access the data in many ways, such as:

Point and click

Open the index file in your browser. You will see a list of links to data files, which you can download by clicking on them.

Azure Blob libraries

Libraries for working with Azure Blobs are available in a range of languages. See the Azure Blob documentation for more information.

Downloading with AzCopy

There are lots of files, so we suggest installing the azcopy command-line tool, which you can download here. It lets you download whole directories or multiple files using wildcards.

For example…

To download the file global_daily_precip_max_20200101.nc to the current directory:
azcopy cp https://metdatasa.blob.core.windows.net/covid19-response/metoffice_global_daily/precip_max/global_daily_precip_max_20200101.nc .

To download the contents of /metoffice_ukv_daily/snow_mean/ to ukv_daily_snow_mean/:
azcopy cp 'https://metdatasa.blob.core.windows.net/covid19-response/metoffice_ukv_daily/snow_mean/*' ukv_daily_snow_mean/

To download all the US state county-averaged meteorology data that match the pattern us_55*.csv:
azcopy cp --recursive --include-pattern 'us_55*.csv' https://metdatasa.blob.core.windows.net/covid19-response/regional_subset_data/us_data/ .

How the data is organised

metoffice_global_daily/

…contains the Met Office daily global gridded data files. There is a directory for each variable.

Each file is named according to global_daily_{variable}_{statistic}_{YYYYMMDD}.nc, for example:

metoffice_global_daily/precip_max/global_daily_precip_max_20200101.nc

…contains the gridded maximum precipitation data from Jan 1, 2020.
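The naming scheme above can also be generated programmatically. A minimal sketch (the helper name is ours, not part of the dataset):

```python
from datetime import date

def global_daily_path(variable: str, statistic: str, day: date) -> str:
    """Build the blob path for a daily global file, following the
    global_daily_{variable}_{statistic}_{YYYYMMDD}.nc scheme described above."""
    return (f"metoffice_global_daily/{variable}_{statistic}/"
            f"global_daily_{variable}_{statistic}_{day:%Y%m%d}.nc")
```

For example, `global_daily_path("precip", "max", date(2020, 1, 1))` reproduces the example path above.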

metoffice_global_hourly/

…contains the Met Office hourly global gridded data files.

Each file is named according to global_hourly_{variable}_global_{YYYYMMDD}.nc, for example:

metoffice_global_hourly/precip/global_hourly_precip_20200101.nc

…contains gridded hourly precipitation data from Jan 1, 2020.

metoffice_ukv_daily/

…contains the Met Office daily UKV gridded data files.

Each file is named according to ukv_daily_{variable}_{statistic}_{YYYYMMDD}.nc.

metoffice_ukv_hourly/

…contains the Met Office hourly UKV gridded data files.

Each file is named according to ukv_hourly_{variable}_{YYYYMMDD}.nc.

metoffice_aqum_daily/

…contains the Met Office daily AQUM gridded data files.

Each file is named according to aqum_daily_{variable}_{statistic}_{YYYYMMDD}.nc.

metoffice_aqum_hourly/

…contains the Met Office hourly AQUM gridded data files.

Each file is named according to aqum_hourly_{variable}_{YYYYMMDD}.nc.

regional_subset_data/

…contains processed regional daily values for the UK, the USA, Italy, Brazil, Vietnam, and Uganda as .csv files.

Processed files represent the period from Jan 1 to Apr 19, 2020. Files were processed by subsetting the gridded Met Office global daily files using shapefiles for each region, taking the latitude-longitude mean value for each variable in each region for each date and saving those values as a table in a .csv file.
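The aggregation described above amounts to a masked latitude-longitude mean. A minimal numpy sketch, using a bounding box as a simplified stand-in for the real per-region shapefile masks:

```python
import numpy as np

def region_mean(values, lats, lons, lat_bounds, lon_bounds):
    """Mean of a (lat, lon) gridded field over a bounding box.

    The actual processing subset the grid with per-region shapefiles;
    the rectangular box here is a simplified stand-in for illustration.
    """
    lat_mask = (lats >= lat_bounds[0]) & (lats <= lat_bounds[1])
    lon_mask = (lons >= lon_bounds[0]) & (lons <= lon_bounds[1])
    return float(values[np.ix_(lat_mask, lon_mask)].mean())
```

Running this per variable and per date, then writing the results as rows of a table, yields a CSV of the kind described.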

Each file in this directory (one .csv file per region) is named according to {shapefile_name}/{shapefile_name}_metoffice_global_daily_bbox_{start_date}-{end_date}.csv.

For example, the file:

regional_subset_data/gadm36Uganda2/gadm36Uganda2_metoffice_global_daily_20200101-20200419.csv

…represents processed data for Uganda.

shapefiles/

Contains shapefiles for the UK, the USA, Italy, Brazil, Uganda, and Vietnam.

  • UK/ = UK COVID-19 reporting regions
  • USA/ = USA state counties
  • Italy/ = GADM v3.6 administrative level 2 for Italy
  • Brazil/ = GADM v3.6 administrative level 2 for Brazil
  • Uganda/ = GADM v3.6 administrative level 2 for Uganda
  • Vietnam/ = GADM v3.6 administrative level 2 for Vietnam

Where possible, filenames are as described. However, given the short time frames in which this data has been made available, minor variations in filename descriptions may occur. Filenames should still be accurately descriptive of the data. If you find issues with any filenames, or the data itself, please contact us at covid19@informaticslab.co.uk.

Getting help and contact

For help or additional data requests please contact us at covid19@informaticslab.co.uk.

Notices

MICROSOFT PROVIDES AZURE OPEN DATASETS ON AN “AS IS” BASIS. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, GUARANTEES OR CONDITIONS WITH RESPECT TO YOUR USE OF THE DATASETS. TO THE EXTENT PERMITTED UNDER YOUR LOCAL LAW, MICROSOFT DISCLAIMS ALL LIABILITY FOR ANY DAMAGES OR LOSSES, INCLUDING DIRECT, CONSEQUENTIAL, SPECIAL, INDIRECT, INCIDENTAL OR PUNITIVE, RESULTING FROM YOUR USE OF THE DATASETS.

Access

Available in: Azure Notebooks

When to use: Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine.

Language: Python

Met Office COVID-19 response dataset

This dataset is created, curated and updated for researchers looking to understand links between COVID-19 and environmental factors.

For more information check out our blog post and the data readme.

We are constantly updating the available data; subscribe to our news group to stay up to date or contact us if you have any requests or questions.

Imports and globals

Import the required modules, set the default plot size, and define some constants.

In [1]:
import matplotlib.pyplot as plt 
import datetime

from azure.storage.blob import BlobClient, ContainerClient
from IPython.display import Markdown
from collections import namedtuple

%matplotlib inline
plt.rcParams['figure.figsize'] = (20.0, 10.0)

Set up the blob client with the connection details

In [2]:
account_url = 'https://metdatasa.blob.core.windows.net/'
container_name = 'covid19-response'

# Create the ContainerClient object which will be used to enumerate blobs
container_client = ContainerClient(account_url=account_url,
                                   container_name=container_name,
                                   credential=None)

List the files under metoffice_global_daily/t1o5m_max

In [3]:
max_blobs = 10
for i_blob, blob in enumerate(container_client.list_blobs(
        name_starts_with='metoffice_global_daily/t1o5m_max')):
    if i_blob >= max_blobs:
        break
    print(blob.name)
metoffice_global_daily/t1o5m_max/global_daily_t1o5m_max_20200101.nc
metoffice_global_daily/t1o5m_max/global_daily_t1o5m_max_20200102.nc
metoffice_global_daily/t1o5m_max/global_daily_t1o5m_max_20200103.nc
metoffice_global_daily/t1o5m_max/global_daily_t1o5m_max_20200104.nc
metoffice_global_daily/t1o5m_max/global_daily_t1o5m_max_20200105.nc
metoffice_global_daily/t1o5m_max/global_daily_t1o5m_max_20200106.nc
metoffice_global_daily/t1o5m_max/global_daily_t1o5m_max_20200107.nc
metoffice_global_daily/t1o5m_max/global_daily_t1o5m_max_20200108.nc
metoffice_global_daily/t1o5m_max/global_daily_t1o5m_max_20200109.nc
metoffice_global_daily/t1o5m_max/global_daily_t1o5m_max_20200110.nc

Build the URL for a particular file from its properties

In [4]:
data_end = (datetime.datetime.now() - datetime.timedelta(days=9)).date()
data_start = datetime.date(2020,1,1)

def url_from_properties(model, param, freq, stat=None, day=None, hour=None):
    assert model in ["global", "ukv"]
    assert param in ["rain", "sh", "snow", "t1o5m", "pmsl", "precip", "sw"]
    assert freq in ["daily", "hourly"]
    if freq == 'daily':
        assert stat in ['max', 'min', 'mean']
    else:
        assert stat is None
    assert data_start <= day <= data_end

    stat = '_' + stat if stat else ''

    filepath = f'metoffice_{model}_{freq}/{param}{stat}/{model}_{freq}_{param}{stat}_{day:%Y%m%d}.nc'
    # account_url already ends with a slash, so don't add another one
    return f"{account_url}{container_name}/{filepath}"

Properties = namedtuple('Properties', ["model", "param", "freq", "stat", "day"])

files = [
    Properties("global", "precip", "daily", "mean", datetime.date(2020, 3, 3)),
    Properties("ukv", "t1o5m", "daily", "min", datetime.date(2020, 4, 1)),
    Properties("ukv", "snow", "hourly", None, datetime.date(2020, 2, 2)),
]

for file in files:
    path = url_from_properties(*file)
    print(path.replace(account_url, ''))
covid19-response/metoffice_global_daily/precip_mean/global_daily_precip_mean_20200303.nc
covid19-response/metoffice_ukv_daily/t1o5m_min/ukv_daily_t1o5m_min_20200401.nc
covid19-response/metoffice_ukv_hourly/snow/ukv_hourly_snow_20200202.nc

xarray and iris are useful tools for interacting with this sort of data.

In [5]:
import xarray as xr
import iris
from io import BytesIO

Stream the blob into memory and load the dataset with xarray

In [6]:
data_description = Properties("global","precip","daily","mean",datetime.date(2020,1,30))
file_data = BytesIO(BlobClient.from_blob_url(
    url_from_properties(*data_description)).download_blob().readall())
ds = xr.open_dataset(file_data)
ds
Out[6]:
<xarray.Dataset>
Dimensions:                       (bnds: 2, latitude: 1920, longitude: 2560)
Coordinates:
  * latitude                      (latitude) float32 -89.953125 -89.859375 ... 89.953125
  * longitude                     (longitude) float32 0.0703125 0.2109375 ... 359.9297
    forecast_period               timedelta64[ns] ...
    forecast_reference_time       datetime64[ns] ...
    time                          datetime64[ns] ...
Data variables:
    precipitation_flux            (latitude, longitude) float32 ...
    latitude_longitude            int32 ...
    forecast_period_bnds          (bnds) float64 ...
    forecast_reference_time_bnds  (bnds) datetime64[ns] ...
    time_bnds                     (bnds) datetime64[ns] ...
Attributes:
    source:       Data from Met Office Unified Model
    um_version:   11.2
    Conventions:  CF-1.5

Plot it, and load a UKV file with iris

In [ ]:
import tempfile

ds.precipitation_flux.plot()

tmp = tempfile.NamedTemporaryFile(delete=False)
data_description = Properties("ukv","sw","hourly",None,datetime.date(2020,1,30))
tmp.write(BlobClient.from_blob_url(
    url_from_properties(*data_description)).download_blob().readall())
local_path = tmp.name
tmp.close()

sw = iris.load_cube(local_path)
sw
Out[ ]:
M01S01I202 (1)               forecast_period  forecast_reference_time  grid_latitude  grid_longitude
Shape                        6                4                        808            621
Dimension coordinates:
    forecast_period          x                -                        -              -
    forecast_reference_time  -                x                        -              -
    grid_latitude            -                -                        x              -
    grid_longitude           -                -                        -              x
Auxiliary coordinates:
    time                     x                x                        -              -
Attributes:
    Conventions: CF-1.5
    STASH: m01s01i202
    source: Data from Met Office Unified Model
    um_version: 11.2