
NOAA Integrated Surface Data (ISD)


Worldwide hourly weather history data (example: temperature, precipitation, wind) sourced from the National Oceanic and Atmospheric Administration (NOAA).

The Integrated Surface Dataset (ISD) is composed of worldwide surface weather observations from over 35,000 stations, though the best spatial coverage is evident in North America, Europe, Australia, and parts of Asia. Parameters included are: air quality, atmospheric pressure, atmospheric temperature/dew point, atmospheric winds, clouds, precipitation, ocean waves, tides and more. ISD refers to the data contained within the digital database as well as the format in which the hourly, synoptic (3-hourly), and daily weather observations are stored.

Volume and Retention

This dataset is stored in Parquet format. It is updated daily, and contains about 400M rows (20GB) in total as of 2019.

This dataset contains historical records accumulated from 2008 to the present. You can use parameter settings in our SDK to fetch data within a specific time range.
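As a sketch of that usage (assuming the azureml-opendatasets package shown in the samples below), fetching a fixed one-month window might look like this:

# Sketch: fetch ISD observations for an explicit time range.
# Assumes the azureml-opendatasets package from the samples below is installed.
from datetime import datetime
from azureml.opendatasets import NoaaIsdWeather

start_date = datetime(2019, 1, 1)   # start of the window
end_date = datetime(2019, 1, 31)    # end of the window

isd = NoaaIsdWeather(start_date, end_date)
isd_df = isd.to_pandas_dataframe()
print(isd_df.shape)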

Storage Location

This dataset is stored in the East US Azure region. Allocating compute resources in East US is recommended for affinity.

Additional Information

This dataset is sourced from the NOAA Integrated Surface Database. Additional information about this dataset can be found here and here. Email if you have any questions about the data source.

Notices

MICROSOFT PROVIDES AZURE OPEN DATASETS ON AN “AS IS” BASIS. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, GUARANTEES OR CONDITIONS WITH RESPECT TO YOUR USE OF THE DATASETS. TO THE EXTENT PERMITTED UNDER YOUR LOCAL LAW, MICROSOFT DISCLAIMS ALL LIABILITY FOR ANY DAMAGES OR LOSSES, INCLUDING DIRECT, CONSEQUENTIAL, SPECIAL, INDIRECT, INCIDENTAL OR PUNITIVE, RESULTING FROM YOUR USE OF THE DATASETS.

This dataset is provided under the original terms under which Microsoft received the source data. The dataset may include data sourced from Microsoft.

Access

Available in | When to use
Azure Notebooks | Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine.
Azure Databricks | Use this when you need the scale of an Azure managed Spark cluster to process the dataset.

Preview

usaf | wban | datetime | latitude | longitude | elevation | stationName | countryOrRegion | p_k | year | day | version | month
722934 | 53121 | 1/16/2020 7:59:00 AM | 33.219 | -117.349 | 9 | OCEANSIDE MUNICIPAL ARPT | US | 722934-53121 | 2020 | 16 | 1 | 1
727976 | 24217 | 1/16/2020 7:59:00 AM | 48.794 | -122.537 | 46 | BELLINGHAM INTL AIRPORT | US | 727976-24217 | 2020 | 16 | 1 | 1
727850 | 24157 | 1/16/2020 7:59:00 AM | 47.622 | -117.528 | 721 | SPOKANE INTERNATIONAL AIRPORT | US | 727850-24157 | 2020 | 16 | 1 | 1
722975 | 53141 | 1/16/2020 7:59:00 AM | 33.79 | -118.051 | 11 | LOS ALAMITOS AAF AIRPORT | US | 722975-53141 | 2020 | 16 | 1 | 1
723830 | 23187 | 1/16/2020 7:59:00 AM | 34.744 | -118.724 | 1379 | SANDBERG | US | 723830-23187 | 2020 | 16 | 1 | 1
722926 | 03154 | 1/16/2020 7:59:00 AM | 33.3 | -117.35 | 24 | MARINE CORPS AIR STATION | US | 722926-03154 | 2020 | 16 | 1 | 1
727830 | 24149 | 1/16/2020 7:59:00 AM | 46.375 | -117.015 | 438 | LEWISTON-NEZ PERCE COUNTY AIRPORT | US | 727830-24149 | 2020 | 16 | 1 | 1
745056 | 53120 | 1/16/2020 7:59:00 AM | 33.038 | -116.915 | 425 | RAMONA AIRPORT | US | 745056-53120 | 2020 | 16 | 1 | 1
726817 | 24154 | 1/16/2020 7:59:00 AM | 47.457 | -115.645 | 1851 | MULLAN PASS | US | 726817-24154 | 2020 | 16 | 1 | 1
722208 | 04224 | 1/16/2020 7:59:00 AM | 48.708 | -122.91 | 9 | ORCAS ISLAND AIRPORT | US | 722208-04224 | 2020 | 16 | 1 | 1
Name | Data type | Unique | Values (sample) | Description
cloudCoverage | string | 8 | CLR, OVC | The fraction of the sky covered by all visible clouds. Cloud coverage values: CLR = Clear skies; FEW = Few clouds; SCT = Scattered clouds; BKN = Broken cloud cover; OVC = Overcast; OBS = Sky is obscured/can't be estimated; POBS = Sky is partially obscured.
countryOrRegion | string | 245 | US, CA | Country or region code.
datetime | timestamp | 6,331,773 | 2018-04-12 12:00:00, 2019-02-04 12:00:00 | The UTC datetime of a GEOPHYSICAL-POINT-OBSERVATION.
day | int | 31 | 1, 5 | The day of the column datetime.
elevation | double | 2,354 | 5.0, 3.0 | The elevation of a GEOPHYSICAL-POINT-OBSERVATION relative to Mean Sea Level (MSL).
latitude | double | 34,566 | 38.544, 34.822 | The latitude coordinate of a GEOPHYSICAL-POINT-OBSERVATION; southern-hemisphere values are negative.
longitude | double | 57,743 | -86.0, -96.622 | The longitude coordinate of a GEOPHYSICAL-POINT-OBSERVATION; values west from 000000 to 179999 are signed negative.
month | int | 12 | 1, 12 | The month of the column datetime.
p_k | string | 17,256 | 999999-04223, 999999-27516 | usaf-wban.
pastWeatherIndicator | int | 11 | 2, 6 | Past weather indicator, showing the weather over the past hour. 0: Cloud covering 1/2 or less of the sky throughout the appropriate period; 1: Cloud covering more than 1/2 of the sky during part of the appropriate period and covering 1/2 or less during part of the period; 2: Cloud covering more than 1/2 of the sky throughout the appropriate period; 3: Sandstorm, duststorm or blowing snow; 4: Fog or ice fog or thick haze; 5: Drizzle; 6: Rain; 7: Snow, or rain and snow mixed; 8: Shower(s); 9: Thunderstorm(s) with or without precipitation.
precipDepth | double | 5,574 | 9999.0, 3.0 | The depth of LIQUID-PRECIPITATION measured at the time of observation. UNITS: millimeters; MIN: 0000; MAX: 9998; 9999 = Missing.
precipTime | double | 44 | 1.0, 24.0 | The quantity of time over which the LIQUID-PRECIPITATION was measured. UNITS: hours; MIN: 00; MAX: 98; 99 = Missing.
presentWeatherIndicator | int | 101 | 10, 5 | Present weather indicator, showing the weather in the present hour. 00: Cloud development not observed or not observable; 01: Clouds generally dissolving or becoming less developed; 02: State of sky on the whole unchanged; 03: Clouds generally forming or developing; 04: Visibility reduced by smoke, e.g. veldt or forest fires, industrial smoke or volcanic ashes; 05: Haze; 06: Widespread dust in suspension in the air, not raised by wind at or near the station at the time of observation; 07: Dust or sand raised by wind at or near the station at the time of observation, but no well-developed dust whirl(s) or sand whirl(s), and no duststorm or sandstorm seen or, in the case of ships, blowing spray at the station; 08: Well-developed dust whirl(s) or sand whirl(s) seen at or near the station during the preceding hour or at the time of observation, but no duststorm or sandstorm; 09: Duststorm or sandstorm within sight at the time of observation, or at the station during the preceding hour. For more, see section 'MW1' in ftp://ftp.ncdc.noaa.gov/pub/data/noaa/isd-format-document.pdf.
seaLvlPressure | double | 2,214 | 1015.0, 1015.2 | The air pressure relative to Mean Sea Level (MSL). MIN: 08600; MAX: 10900; UNITS: hectopascals.
snowDepth | double | 652 | 1.0, 3.0 | The depth of snow and ice on the ground. MIN: 0000; MAX: 1200; UNITS: centimeters.
stationName | string | 16,546 | DARRINGTON 21 NNE, NUNN 7 NNE | Name of the weather station.
temperature | double | 1,466 | 15.0, 13.0 | The temperature of the air. MIN: -0932; MAX: +0618; UNITS: degrees Celsius.
usaf | string | 16,575 | 999999, 062350 | AIR FORCE CATALOG station number.
version | double | 1 | 1.0 |
wban | string | 2,548 | 99999, 04223 | NCDC WBAN number.
windAngle | int | 362 | 180, 270 | The angle, measured in a clockwise direction, between true north and the direction from which the wind is blowing. MIN: 001; MAX: 360; UNITS: angular degrees.
windSpeed | double | 613 | 2.1, 1.5 | The rate of horizontal travel of air past a fixed point. MIN: 0000; MAX: 0900; UNITS: meters per second.
year | int | 13 | 2018, 2019 | The year of the column datetime.
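Note that several numeric columns use sentinel values for missing observations (for example, precipDepth uses 9999 and precipTime uses 99). A minimal pandas sketch, assuming a dataframe named isd_df loaded as in the samples below, that masks those sentinels before analysis:

# Sketch: replace the documented sentinel values with NaN before analysis.
# Assumes isd_df is a pandas DataFrame loaded as in the samples below.
import numpy as np

clean_df = isd_df.copy()
clean_df['precipDepth'] = clean_df['precipDepth'].replace(9999.0, np.nan)  # 9999 = Missing
clean_df['precipTime'] = clean_df['precipTime'].replace(99.0, np.nan)      # 99 = Missing
print(clean_df[['precipDepth', 'precipTime']].describe())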


Azure Notebooks

Language: Python
In [1]:
# This is a package in preview.
from azureml.opendatasets import NoaaIsdWeather

from datetime import datetime
from dateutil.relativedelta import relativedelta


end_date = datetime.today()
start_date = datetime.today() - relativedelta(months=1)

# Get historical weather data in the past month.
isd = NoaaIsdWeather(start_date, end_date)
# Read into Pandas data frame.
isd_df = isd.to_pandas_dataframe()
ActivityStarted, to_pandas_dataframe
ActivityStarted, to_pandas_dataframe_in_worker
Target paths: ['/year=2019/month=6/']
Looking for parquet files...
Reading them into Pandas dataframe...
Reading ISDWeather/year=2019/month=6/part-00049-tid-7654660707407597606-ec55d6c6-0d34-4a97-b2c8-d201080c9a98-89240.c000.snappy.parquet under container isdweatherdatacontainer
Done.
ActivityCompleted: Activity=to_pandas_dataframe_in_worker, HowEnded=Success, Duration=116905.15 [ms]
ActivityCompleted: Activity=to_pandas_dataframe, HowEnded=Success, Duration=116907.63 [ms]
In [2]:
isd_df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 7790719 entries, 2709 to 11337856
Data columns (total 22 columns):
usaf                       object
wban                       object
datetime                   datetime64[ns]
latitude                   float64
longitude                  float64
elevation                  float64
windAngle                  float64
windSpeed                  float64
temperature                float64
seaLvlPressure             float64
cloudCoverage              object
presentWeatherIndicator    float64
pastWeatherIndicator       float64
precipTime                 float64
precipDepth                float64
snowDepth                  float64
stationName                object
countryOrRegion            object
p_k                        object
year                       int32
day                        int32
version                    float64
dtypes: datetime64[ns](1), float64(13), int32(2), object(6)
memory usage: 1.3+ GB
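As one way to continue exploring the frame returned above (an illustrative sketch, not part of the original sample), daily mean temperatures per station can be computed with a groupby on the documented columns:

# Sketch: daily mean temperature per station from the frame loaded above.
daily_temp = (
    isd_df
    .groupby(['stationName', isd_df['datetime'].dt.date])['temperature']
    .mean()
)
print(daily_temp.head())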
# Pip install packages
import os, sys

!{sys.executable} -m pip install azure-storage
!{sys.executable} -m pip install pyarrow
!{sys.executable} -m pip install pandas

# COMMAND ----------

# Azure storage access info
azure_storage_account_name = "azureopendatastorage"
azure_storage_sas_token = r""
container_name = "isdweatherdatacontainer"
folder_name = "ISDWeather/"

# COMMAND ----------

from azure.storage.blob import BlockBlobService

if azure_storage_account_name is None or azure_storage_sas_token is None:
    raise Exception("Provide your specific name and key for your Azure Storage account--see the Prerequisites section earlier.")

print('Looking for the first parquet under the folder ' + folder_name + ' in container "' + container_name + '"...')
blob_service = BlockBlobService(account_name = azure_storage_account_name, sas_token = azure_storage_sas_token,)
blobs = blob_service.list_blobs(container_name)
sorted_blobs = sorted(list(blobs), key=lambda e: e.name, reverse=True)
targetBlobName=''
for blob in sorted_blobs:
    if blob.name.startswith(folder_name) and blob.name.endswith('.parquet'):
        targetBlobName = blob.name
        break

print('Target blob to download: ' + targetBlobName)
_, filename = os.path.split(targetBlobName)
parquet_file=blob_service.get_blob_to_path(container_name, targetBlobName, filename)

# COMMAND ----------

# Read the local parquet file into Pandas data frame
import pyarrow.parquet as pq
import pandas as pd

print('Reading the local parquet file into Pandas data frame')
df = pq.read_table(filename).to_pandas()

# COMMAND ----------

# You can add your own filter below
print('Loaded as a Pandas data frame: ')
df
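# Illustrative filter (not part of the original sample): keep only rows for US
# stations, using the countryOrRegion column documented in the schema above.
df_us = df[df['countryOrRegion'] == 'US']
print('Rows for US stations: ' + str(len(df_us)))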

# COMMAND ----------


Azure Databricks

Language: Python
In [1]:
# This is a package in preview.
# You need to pip install azureml-opendatasets in Databricks cluster. https://docs.microsoft.com/en-us/azure/data-explorer/connect-from-databricks#install-the-python-library-on-your-azure-databricks-cluster
from azureml.opendatasets import NoaaIsdWeather

from datetime import datetime
from dateutil.relativedelta import relativedelta


end_date = datetime.today()
start_date = datetime.today() - relativedelta(months=1)
isd = NoaaIsdWeather(start_date, end_date)
isd_df = isd.to_spark_dataframe()
ActivityStarted, to_spark_dataframe
ActivityStarted, to_spark_dataframe_in_worker
ActivityCompleted: Activity=to_spark_dataframe_in_worker, HowEnded=Success, Duration=87171.59 [ms]
ActivityCompleted: Activity=to_spark_dataframe, HowEnded=Success, Duration=87176.63 [ms]
In [2]:
display(isd_df.limit(5))
usaf | wban | datetime | latitude | longitude | elevation | windAngle | windSpeed | temperature | seaLvlPressure | cloudCoverage | presentWeatherIndicator | pastWeatherIndicator | precipTime | precipDepth | snowDepth | stationName | countryOrRegion | p_k | year | day | version | month
726163 | 54770 | 2019-06-30T21:38:00.000+0000 | 42.805 | -72.004 | 317.0 | null | 2.6 | 17.2 | null | null | 61 | null | 1.0 | 43.0 | null | JAFFREY MINI-SLVR RNCH APT | US | 726163-54770 | 2019 | 30 | 1.0 | 6
726163 | 54770 | 2019-06-30T21:52:00.000+0000 | 42.805 | -72.004 | 317.0 | null | 1.5 | 17.2 | 1008.6 | null | null | null | 1.0 | 43.0 | null | JAFFREY MINI-SLVR RNCH APT | US | 726163-54770 | 2019 | 30 | 1.0 | 6
726163 | 54770 | 2019-06-30T22:52:00.000+0000 | 42.805 | -72.004 | 317.0 | null | 2.1 | 18.9 | 1008.8 | CLR | null | null | 1.0 | 0.0 | null | JAFFREY MINI-SLVR RNCH APT | US | 726163-54770 | 2019 | 30 | 1.0 | 6
726163 | 54770 | 2019-06-30T23:52:00.000+0000 | 42.805 | -72.004 | 317.0 | null | 1.5 | 18.3 | 1009.1 | FEW | null | null | 6.0 | 94.0 | null | JAFFREY MINI-SLVR RNCH APT | US | 726163-54770 | 2019 | 30 | 1.0 | 6
703260 | 25503 | 2019-06-15T07:54:00.000+0000 | 58.683 | -156.656 | 15.0 | 70 | 4.1 | 10.0 | 1005.6 | null | 61 | null | 1.0 | 0.0 | null | KING SALMON AIRPORT | US | 703260-25503 | 2019 | 15 | 1.0 | 6
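A short Spark-side aggregation on the same frame (an illustrative sketch using columns from the schema above, not part of the original sample):

# Sketch: mean temperature per station on the Spark DataFrame loaded above.
import pyspark.sql.functions as F

station_temp = (
    isd_df
    .groupBy('stationName')
    .agg(F.avg('temperature').alias('avgTemperature'))
)
display(station_temp.orderBy(F.desc('avgTemperature')).limit(10))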
# Databricks notebook source
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "isdweatherdatacontainer"
blob_relative_path = "ISDWeather/"
blob_sas_token = r""

# COMMAND ----------

# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)

# COMMAND ----------

# Read the Parquet files with Spark; note that evaluation is lazy, so no data is loaded yet
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')

# COMMAND ----------

# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))
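
Because the DataFrame is registered as the temporary view source, further analysis can be written in SQL. One possible sketch (columns taken from the schema above; not part of the original sample):

# Sketch: average temperature and wind speed by country or region, via the 'source' view.
summary = spark.sql("""
    SELECT countryOrRegion,
           AVG(temperature) AS avgTemperature,
           AVG(windSpeed)   AS avgWindSpeed
    FROM source
    GROUP BY countryOrRegion
    ORDER BY avgTemperature DESC
""")
display(summary.limit(10))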

Urban Heat Islands

The Urban Innovation Initiative at Microsoft Research provides data processing and analytics scripts for hourly NOAA weather station data that produce daily urban heat island indices for hundreds of U.S. cities, covering January 1, 2008 to the present, with automated daily updating. Urban heat island effects are then examined over time and across cities, as well as aligned with population density.