Daymet

Estimates of daily weather parameters in North America on a one-kilometer grid, with monthly and annual summaries.

Daymet provides gridded estimates of near-surface meteorological conditions; the main purpose of Daymet is to provide data estimates where no instrumentation exists. This dataset provides Daymet Version 4 data for North America, including the island areas of Hawaii and Puerto Rico (which are available as files separate from the continental land mass). Daymet output variables include minimum temperature, maximum temperature, precipitation, shortwave radiation, vapor pressure, snow water equivalent, and day length. The dataset covers the period from January 1, 1980 to the present. Each year is processed individually at the close of a calendar year. Daymet variables are continuous surfaces provided as individual files, by variable and year, at 1-kilometer spatial resolution and daily temporal resolution. Data are in a Lambert Conformal Conic projection for North America and are distributed in Zarr format and in netCDF format compliant with Climate and Forecast (CF) metadata conventions (version 1.6).

We also provide monthly and annual climate summaries.

Storage resources

Data are stored as blobs in the West Europe Azure region, in two formats (Zarr and netCDF), in the following blob containers:

  • https://daymeteuwest.blob.core.windows.net/daymet-zarr
  • https://daymeteuwest.blob.core.windows.net/daymet-nc

We recommend the Zarr format if you’re analyzing part or all of the data directly from Azure Blob Storage, without downloading it locally. If you’re downloading an entire block of the data (for example, a specific variable for a specific time period), either Zarr or netCDF is appropriate. See the example notebooks for demonstrations of both formats.
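As a brief, hedged sketch of both patterns, fsspec (with the adlfs backend, both used in the notebook below) can browse the containers and fetch an individual netCDF file; anonymous read access suffices for these public containers:

import fsspec

# Anonymous, read-only filesystem over the public storage account
fs = fsspec.filesystem("az", account_name="daymeteuwest")

# Browse the two containers
print(fs.ls("daymet-zarr"))
print(fs.ls("daymet-nc"))

# Download a single file (one variable, region, and year) for local use
fs.get(
    "daymet-nc/daymet_v4_daily/daymet_v4_daily_hi_prcp_1980.nc",
    "daymet_v4_daily_hi_prcp_1980.nc",
)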

Zarr Layout

Zarr files are named as:

daymet-zarr/[frequency]/[region].zarr

  • frequency is one of (daily, monthly, annual)
  • region is one of (hi, na, and pr), for Hawaii, North America (continental), and Puerto Rico, respectively

All of the data variables (tmin, tmax, etc.) are available within each group. See below for documentation of the available variables.

For example:

  • daymet-zarr/daily/hi.zarr (Hawaii at daily frequency)
  • daymet-zarr/monthly/na.zarr (North America at monthly frequency)
  • daymet-zarr/annual/pr.zarr (Puerto Rico at annual frequency)
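A short sketch of opening one of these groups lazily with xarray (the full pattern is demonstrated in the notebook under “data access”):

import fsspec
import xarray as xr

# Map the monthly North America group as a key-value store; opening it
# reads only the metadata, not the data itself
store = fsspec.get_mapper(
    "az://daymet-zarr/monthly/na.zarr", account_name="daymeteuwest"
)
ds = xr.open_zarr(store, consolidated=True)
print(list(ds.data_vars))  # the variables available at this frequency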

netCDF Layout

The netCDF files are named as:

daymet-nc/daymet_v4_daily/daymet_v4_daily_[region]_[variable]_[year].nc

for daily data, and

daymet-nc/daymet_v4_[monthly|annual]/daymet_v4_[variable]_[frequency]_[region]_[year].nc

for the monthly and annual summaries (note that the name components appear in a different order in the two patterns; compare the examples below).

region has the same definition for netCDF as for Zarr (see above).

For the summary files, frequency is one of:

  • monttl (monthly totals)
  • monavg (monthly averages)
  • annttl (annual totals)
  • annavg (annual averages)

For example:

  • daymet-nc/daymet_v4_daily/daymet_v4_daily_hi_prcp_1980.nc (daily precipitation in Hawaii for 1980)
  • daymet-nc/daymet_v4_monthly/daymet_v4_prcp_monttl_pr_1980.nc (monthly total precipitation in Puerto Rico in 1980)
  • daymet-nc/daymet_v4_annual/daymet_v4_prcp_annavg_na_1980.nc (annual average precipitation for North America in 1980)
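As an illustrative sketch, a single netCDF file can also be read straight from Blob Storage without downloading it first; this assumes the h5netcdf engine is installed (the files are netCDF-4, i.e. HDF5-based, so h5netcdf can read them from a file-like object):

import fsspec
import xarray as xr

url = "az://daymet-nc/daymet_v4_daily/daymet_v4_daily_hi_prcp_1980.nc"
with fsspec.open(url, account_name="daymeteuwest") as f:
    # Reads metadata lazily from the remote file
    ds = xr.open_dataset(f, engine="h5netcdf")
    print(ds["prcp"])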

See the Daymet User Guide for more details.

Variables

The following variables are available:

  • tmin (minimum temperature)
  • tmax (maximum temperature)
  • prcp (precipitation)
  • srad (shortwave radiation)
  • vp (vapor pressure)
  • swe (snow water equivalent)
  • dayl (day length, daily frequency data only)
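These variables combine naturally once a dataset is open. For example, given a dataset ds opened as in the notebook below, a rough daily mean temperature can be derived from the extremes (tmean is our own illustrative name, not a Daymet variable):

# Midpoint of the daily extremes as an approximate daily mean (deg C);
# illustrative only, not an official Daymet output
tmean = (ds["tmin"] + ds["tmax"]) / 2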

Access

Complete Python examples of accessing and plotting Daymet data in both Zarr and netCDF formats are available under “data access”.

We also provide a read-only SAS (shared access signature) token to allow access to Daymet data via, e.g., BlobFuse, which allows you to mount blob containers as drives:

?sv=2019-12-12&si=daymet-ro&sr=c&sig=V85jbO5Ajj46%2BOwM3SYIA2MfXFvr8qq6Hvse3U9kJfc%3D

Mounting instructions for Linux are available in the BlobFuse documentation.
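The same token can also be supplied from Python. A minimal sketch, assuming adlfs accepts the token via its sas_token parameter:

import fsspec
import xarray as xr

# Read-only SAS token from above
sas_token = (
    "?sv=2019-12-12&si=daymet-ro&sr=c"
    "&sig=V85jbO5Ajj46%2BOwM3SYIA2MfXFvr8qq6Hvse3U9kJfc%3D"
)

store = fsspec.get_mapper(
    "az://daymet-zarr/daily/hi.zarr",
    account_name="daymeteuwest",
    sas_token=sas_token,
)
ds = xr.open_zarr(store, consolidated=True)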

Large-scale processing using this dataset is best performed in the West Europe Azure data center, where the data is stored. If you are using Daymet data for environmental science applications, consider applying for an AI for Earth grant to support your compute requirements.

A copy of the Daymet v4 NetCDF data is also available in the East US Azure region, and will be maintained there until at least the end of 2021, but we encourage users to migrate to the West Europe copy. The East US data is in the following container:

https://daymet.blob.core.windows.net/daymet-nc

Citation

If you use this data in a publication, please cite one of the following (depending on whether you’re using daily, monthly, or annual data):

  • Thornton, M.M., R. Shrestha, Y. Wei, P.E. Thornton, S. Kao, and B.E. Wilson. 2020. Daymet: Daily Surface Weather Data on a 1-km Grid for North America, Version 4. ORNL DAAC, Oak Ridge, Tennessee, USA. https://doi.org/10.3334/ORNLDAAC/1840
  • Thornton, M.M., R. Shrestha, Y. Wei, P.E. Thornton, S. Kao, and B.E. Wilson. 2020. Daymet: Monthly Climate Summaries on a 1-km Grid for North America, Version 4. ORNL DAAC, Oak Ridge, Tennessee, USA. https://doi.org/10.3334/ORNLDAAC/1855
  • Thornton, M.M., R. Shrestha, Y. Wei, P.E. Thornton, S. Kao, and B.E. Wilson. 2020. Daymet: Annual Climate Summaries on a 1-km Grid for North America, Version 4. ORNL DAAC, Oak Ridge, Tennessee, USA. https://doi.org/10.3334/ORNLDAAC/1852

See the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC)’s Data Use and Citations Policy for more information.


Pretty picture


Average daily maximum temperature in Hawaii in 2017.

Contact

For questions about this dataset, contact aiforearthdatasets@microsoft.com.

Notices

MICROSOFT PROVIDES AZURE OPEN DATASETS ON AN “AS IS” BASIS. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, GUARANTEES OR CONDITIONS WITH RESPECT TO YOUR USE OF THE DATASETS. TO THE EXTENT PERMITTED UNDER YOUR LOCAL LAW, MICROSOFT DISCLAIMS ALL LIABILITY FOR ANY DAMAGES OR LOSSES, INCLUDING DIRECT, CONSEQUENTIAL, SPECIAL, INDIRECT, INCIDENTAL OR PUNITIVE, RESULTING FROM YOUR USE OF THE DATASETS.

This dataset is provided under the original terms under which Microsoft received the source data. The dataset may include data sourced from Microsoft.

Access

Azure Notebooks

Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine.

Language: Python

Accessing zarr-formatted Daymet data on Azure

The Daymet dataset contains daily minimum temperature, maximum temperature, precipitation, shortwave radiation, vapor pressure, snow water equivalent, and day length at 1km resolution for North America. The dataset covers the period from January 1, 1980 to December 31, 2019.

The Daymet dataset is maintained at daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=1328 and mirrored on Azure Open Datasets at aka.ms/ai4edata-daymet. Azure also provides a cloud-optimized version of the data in Zarr format, which can be read into an xarray Dataset. If you just need a subset of the data, we recommend using xarray and Zarr to avoid downloading the full dataset unnecessarily.

The datasets are available in the daymeteuwest storage account, in the daymet-zarr container. Files are named according to daymet-zarr/{frequency}/{region}.zarr, where frequency is one of {daily, monthly, annual} and region is one of {hi, na, pr} (for Hawaii, continental North America, and Puerto Rico, respectively). For example, daymet-zarr/daily/hi.zarr.

In [1]:
# Standard or standard-ish imports
import warnings
import matplotlib.pyplot as plt

# Less standard, but still pip- or conda-installable
import xarray as xr
import fsspec

# Neither of these are accessed directly, but both need to be installed; they're used
# via fsspec
import adlfs
import zarr

account_name = 'daymeteuwest'
container_name = 'daymet-zarr'

Load data into an xarray Dataset

We can lazily load the data into an xarray.Dataset by creating a zarr store with fsspec and then reading it in with xarray. This only reads the metadata, so it's safe to call on a dataset that's larger than memory.

In [2]:
store = fsspec.get_mapper('az://' + container_name + '/daily/hi.zarr', account_name=account_name)
# consolidated=True speeds up reading the metadata
ds = xr.open_zarr(store, consolidated=True)
ds
Out[2]:

Working with the data

Using xarray, we can quickly select subsets of the data, perform an aggregation, and plot the result. For example, we'll plot the average of the maximum temperature for the year 2009.

In [3]:
warnings.simplefilter("ignore", RuntimeWarning)
fig, ax = plt.subplots(figsize=(12, 12))
ds.sel(time="2009")["tmax"].mean(dim="time").plot.imshow(ax=ax, cmap="inferno");

Or we can visualize the timeseries of the minimum temperature over the past decade.

In [4]:
fig, ax = plt.subplots(figsize=(12, 6))
ds.sel(time=slice("2010", "2019"))['tmin'].mean(dim=["x", "y"]).plot(ax=ax);

Chunking

Each of the datasets is chunked to allow for parallel, out-of-core, or distributed processing with Dask. Each frequency (daily, monthly, annual) is chunked so that one year of data falls in a single chunk, and each region is chunked in the x and y coordinates so that no single chunk is larger than about 250 MB, which matters primarily for the na region.

In [5]:
ds['prcp']
Out[5]:

So our prcp array has a shape of (14600, 584, 284), where each chunk has a shape of (365, 584, 284). Examining the store for monthly North America, we see that each chunk has a shape of (12, 1250, 1250).

In [6]:
na_store = fsspec.get_mapper("az://" + container_name + "/monthly/na.zarr",
                             account_name=account_name)
na = xr.open_zarr(na_store, consolidated=True)
na['prcp']
Out[6]:
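As a quick sanity check on these chunk shapes, assuming 4-byte (float32) values, the chunk sizes can be computed by hand:

# Daily Hawaii: one year of the full hi grid per chunk
daily_hi_mb = 365 * 584 * 284 * 4 / 1e6     # ~242 MB, under the ~250 MB target

# Monthly North America: one year of a 1250 x 1250 spatial tile per chunk
monthly_na_mb = 12 * 1250 * 1250 * 4 / 1e6  # ~75 MB
print(daily_hi_mb, monthly_na_mb)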

See http://xarray.pydata.org/en/stable/dask.html for more on how xarray uses Dask for parallel computing.
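For instance, here is a minimal sketch of a parallel reduction on the daily Hawaii dataset ds from above, using a local Dask cluster (this cell is our illustration, not part of the original notebook):

from dask.distributed import Client

# A local cluster with one worker per CPU core; the dashboard link it
# prints is handy for watching chunk-level progress
client = Client()

# Domain-wide mean precipitation for 1980; only the chunks overlapping
# that year are fetched from Blob Storage
mean_prcp_1980 = ds.sel(time="1980")["prcp"].mean().compute()
print(float(mean_prcp_1980))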