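Recent Level II data from the NEXRAD system.
NEXRAD (Next-Generation Radar) is a network of 159 radar stations across the US, operated by the National Oceanic and Atmospheric Administration (NOAA). This dataset is used for weather forecasting and climate science.
This dataset is available on Azure thanks to the NOAA Big Data Program.
The most recent 90 days of data are available; older data is kept in archive storage and can be made available on request (contact aiforearthdatasets@microsoft.com).
Storage resources
Data are stored in blobs (one blob per scan) in the East US data center, in the following blob container: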
https://nexradsa.blob.core.windows.net/nexrad-l2
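Scans are named according to the following convention:
https://nexradsa.blob.core.windows.net/nexrad-l2/[year]/[month]/[day]/[station]/[filename]
Individual file names follow this convention:
[station][year][month][day][time]
For example, the following file contains one scan from the KHPX station, taken on July 7, 1997 at 00:08:27 GMT: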
https://nexradsa.blob.core.windows.net/nexrad-l2/1997/07/07/KHPX/KHPX19970707_000827.gz
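The naming convention above can be turned into a small helper. The following is a minimal sketch rather than part of the dataset's tooling: `scan_url` and `parse_scan_filename` are hypothetical names, and both the underscore between date and time and the `.gz` suffix are inferred from the single example file above, so other files may need a different suffix.

```python
from datetime import datetime, timezone

CONTAINER_URL = 'https://nexradsa.blob.core.windows.net/nexrad-l2'


def scan_url(station, when, suffix='.gz'):
    """Build the blob URL for one scan.

    `suffix` is an assumption: the example file ends in '.gz', but other
    files in the container may use a different ending.
    """
    filename = '{}{:%Y%m%d_%H%M%S}{}'.format(station, when, suffix)
    return '{}/{:%Y/%m/%d}/{}/{}'.format(CONTAINER_URL, when, station, filename)


def parse_scan_filename(filename):
    """Split a scan file name into (station, UTC datetime)."""
    station = filename[:4]   # four-letter station code, e.g. KHPX
    stamp = filename[4:19]   # YYYYMMDD_HHMMSS
    when = datetime.strptime(stamp, '%Y%m%d_%H%M%S').replace(tzinfo=timezone.utc)
    return station, when


example_url = scan_url('KHPX', datetime(1997, 7, 7, 0, 8, 27))
print(example_url)
print(parse_scan_filename('KHPX19970707_000827.gz'))
```

Building the URL from a `datetime` rather than from separate strings keeps the zero-padding of month, day, and time consistent with the convention.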
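A complete Python example of accessing and plotting a NEXRAD scan is available in the notebook provided under "Access".
We also provide a read-only SAS (shared access signature) token to allow access to NEXRAD data via, for example, BlobFuse, which lets you mount blob containers as drives: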
https://nexradsa.blob.core.windows.net/nexrad-l2?st=2019-07-26T22%3A26%3A29Z&se=2034-07-27T22%3A26%3A00Z&sp=rl&sv=2018-03-28&sr=c&sig=oHaHPOVn3hs9Dm2WtAKAT64zmZkwwceam8XD8HSVrSg%3D
For Linux mounting instructions, see the BlobFuse documentation.
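To see what the SAS URL above grants before using it, its query string can be decoded with the standard library. This is only an illustrative sketch; the field meanings in the comments follow the general Azure Storage SAS format (`st` start time, `se` expiry, `sp` permissions, `sr` resource type), and no network access is involved.

```python
from urllib.parse import urlsplit, parse_qs

sas_url = ('https://nexradsa.blob.core.windows.net/nexrad-l2'
           '?st=2019-07-26T22%3A26%3A29Z&se=2034-07-27T22%3A26%3A00Z'
           '&sp=rl&sv=2018-03-28&sr=c'
           '&sig=oHaHPOVn3hs9Dm2WtAKAT64zmZkwwceam8XD8HSVrSg%3D')

parts = urlsplit(sas_url)
# parse_qs percent-decodes each value and returns a list per key
fields = {key: values[0] for key, values in parse_qs(parts.query).items()}

container_url = '{}://{}{}'.format(parts.scheme, parts.netloc, parts.path)
print('Container:  ', container_url)
print('Valid from: ', fields['st'])
print('Expires:    ', fields['se'])
print('Permissions:', fields['sp'])   # 'r' = read, 'l' = list
print('Resource:   ', fields['sr'])   # 'c' = container
```

The `rl` permission string is consistent with the token being described as read-only: it allows reading and listing blobs, but not writing or deleting them.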
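NEXRAD data can consume hundreds of terabytes, so large-scale processing is best performed in the East US Azure data center, where the scans are stored. If you are using NEXRAD data for environmental science applications, including weather forecasting, consider applying for an AI for Earth grant to support your compute requirements.
Index
A list of all NEXRAD files is available as a zipped .txt file: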
https://nexradsa.blob.core.windows.net/nexrad-index/nexrad-index.zip
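Because the file list covers the whole archive, it may be more practical to stream it line by line than to load it into memory. The sketch below shows one way to do that with the standard library. The helper name `iter_index_lines` is hypothetical, and the layout of the real index (one blob path per line) is an assumption, so the demonstration runs on a tiny in-memory archive with made-up entries instead of the real download.

```python
import io
import zipfile


def iter_index_lines(zip_source):
    """Yield non-empty text lines from every member of a zip archive.

    `zip_source` may be a local path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for member in archive.namelist():
            with archive.open(member) as raw:
                for line in io.TextIOWrapper(raw, encoding='utf-8'):
                    line = line.strip()
                    if line:
                        yield line


# Demonstration on an in-memory archive with made-up entries; the real
# nexrad-index.zip would be downloaded first and passed in by path.
demo = io.BytesIO()
with zipfile.ZipFile(demo, 'w') as archive:
    archive.writestr('demo-index.txt',
                     '1997/07/07/KHPX/KHPX19970707_000827.gz\n'
                     '1997/07/07/KHPX/KHPX19970707_001423.gz\n')
demo.seek(0)

khpx_entries = [line for line in iter_index_lines(demo) if '/KHPX/' in line]
print(len(khpx_entries), 'entries for KHPX')
```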
We also maintain a SQLite database to facilitate querying images by location and time; see the sample notebook for details.
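As a rough illustration of a time-range query, the sketch below builds a tiny in-memory database and queries it with bound parameters. The table names `station_latlon` and `station_index` and the columns `lat`, `lon`, `ICAO`, `name`, and `date_time` are taken from the notebook on this page. The `filename` column, the timestamp format, and all row values are assumptions made up for the demonstration, and `scans_for_station` is a hypothetical helper.

```python
import sqlite3

# In-memory stand-in for the real index; rows are illustrative only.
conn = sqlite3.connect(':memory:')
conn.executescript('''
    CREATE TABLE station_latlon (lat REAL, lon REAL, ICAO TEXT, name TEXT);
    CREATE TABLE station_index (name TEXT, date_time TEXT, filename TEXT);
''')
conn.execute('INSERT INTO station_latlon VALUES (?, ?, ?, ?)',
             (35.33, -97.28, 'KTLX', 'Oklahoma City'))
conn.executemany('INSERT INTO station_index VALUES (?, ?, ?)', [
    ('KTLX', '2020-08-02 01:11:16', 'KTLX20200802_011116_V06.ar2v'),
    ('KTLX', '2020-08-20 00:00:00', 'KTLX20200820_000000_V06.ar2v'),
    ('KMXX', '2020-08-03 05:00:00', 'KMXX20200803_050000_V06.ar2v'),
])


def scans_for_station(conn, icao, start_date, end_date):
    """Return scans for one station within an inclusive date range."""
    sql = '''SELECT a.name, a.date_time, a.filename
             FROM station_index a
             INNER JOIN station_latlon b ON a.name = b.ICAO
             WHERE b.ICAO = ?
               AND date(a.date_time) BETWEEN ? AND ?
             ORDER BY a.date_time'''
    # Bound parameters avoid building SQL by string formatting
    return conn.execute(sql, (icao, start_date, end_date)).fetchall()


rows = scans_for_station(conn, 'KTLX', '2020-08-02', '2020-08-15')
print(rows)
conn.close()
```

Binding values with `?` placeholders keeps user-supplied dates and station codes out of the SQL text itself.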
Pretty picture
A weather scan near Oklahoma City on June 5, 1991.
Contact
For questions about this dataset, contact aiforearthdatasets@microsoft.com.
Access
| Available in | When to use |
|---|---|
| Azure Notebooks | Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine. |
Azure Notebooks
# Standard-library imports
import os
import shutil
import sqlite3
import tempfile
import urllib.request
import warnings
# Third-party imports (pip- or conda-installable)
import matplotlib.pyplot as plt
import geopy.distance
# pip install progressbar2, not progressbar
import progressbar
# Suppress some warnings generated within pyart
warnings.filterwarnings('ignore',category=DeprecationWarning)
warnings.filterwarnings('ignore',category=FutureWarning)
warnings.filterwarnings('ignore',category=UserWarning)
import pyart
%matplotlib inline
# URL of our index file
index_db_url = 'https://nexradsa.blob.core.windows.net/nexrad-index/NEXRAD_sqllite.db'
# Temporary folder for data we need during execution of this notebook (we'll clean up
# at the end, we promise)
temp_dir = os.path.join(tempfile.gettempdir(),'nexrad')
os.makedirs(temp_dir,exist_ok=True)
# Local copy of the index file
index_db_file_name = os.path.join(temp_dir,'NEXRAD_sqllite.db')
class DownloadProgressBar():
    """
    https://stackoverflow.com/questions/37748105/how-to-use-progressbar-module-with-urlretrieve
    """
    def __init__(self):
        self.pbar = None
    def __call__(self, block_num, block_size, total_size):
        if not self.pbar:
            self.pbar = progressbar.ProgressBar(max_value=total_size)
            self.pbar.start()
        downloaded = block_num * block_size
        if downloaded < total_size:
            self.pbar.update(downloaded)
        else:
            self.pbar.finish()
def download_url(url, destination_filename=None, progress_updater=None, force_download=False):
    """
    Download a URL to a temporary file
    """
    # This is not intended to guarantee uniqueness, we just know it happens to guarantee
    # uniqueness for this application.
    if destination_filename is None:
        url_as_filename = url.replace('://', '_').replace('.', '_').replace('/', '_')
        destination_filename = os.path.join(temp_dir, url_as_filename)
    if (not force_download) and (os.path.isfile(destination_filename)):
        print('Bypassing download of already-downloaded file {}'.format(os.path.basename(url)))
        return destination_filename
    print('Downloading file {}'.format(os.path.basename(url)), end='')
    urllib.request.urlretrieve(url, destination_filename, progress_updater)
    assert(os.path.isfile(destination_filename))
    nBytes = os.path.getsize(destination_filename)
    print('...done, {} bytes.'.format(nBytes))
    return destination_filename
def download_index_db():
    """
    We have created an index (as SQLite db file) that tracks all records added to our NEXRAD
    archive; this function will download that index (~40GB) to a local temporary file, if it
    hasn't already been downloaded. This is a much more sensible thing to do inside the East
    US data center than outside!
    """
    if os.path.isfile(index_db_file_name):
        print('Index file {} exists, bypassing download'.format(os.path.basename(index_db_file_name)))
        return
    else:
        download_url(index_db_url, index_db_file_name, DownloadProgressBar())
def distance(lat1, lon1, lat2, lon2):
    """
    Compute the distance in meters between two lat/lon coordinate pairs
    """
    return geopy.distance.distance((lat1, lon1), (lat2, lon2)).m
def get_closest_coordinate(coordinate_list, lat, lon):
    """
    Find the closest point in a list of lat/lon pairs, used here to find the closest radar
    station to a given lat/lon pair.
    """
    return min(coordinate_list, key=lambda p: distance(lat, lon, p['lat'], p['lon']))
def get_records(sql):
    """
    Execute a SQL query on the index database; returns matching rows.
    """
    download_index_db()
    conn = sqlite3.connect(index_db_file_name)
    try:
        cursor = conn.execute(sql)
        rows = cursor.fetchall()
    finally:
        # 'with conn' would only manage the transaction, so close the connection explicitly
        conn.close()
    return rows
def get_scans_for_nearest_station(lat, lon, start_date, end_date):
    """
    Find all records in a given date range from the station closest to the
    specified lat/lon pair.
    """
    # ICAO is the four-letter code for the station, e.g. "KTLX"
    sql = 'SELECT lat, lon, ICAO, name FROM station_latlon'
    records = get_records(sql)
    coordinate_list = []
    for row in records:
        coordinate_list.append({'lat': row[0], 'lon': row[1],
                                'icao': row[2], 'name': row[3]})
    # Find the coordinates of the station closest to the given latitude and longitude
    print('Searching for the nearest station to {},{}'.format(lat,lon))
    closest_coordinate = get_closest_coordinate(coordinate_list, lat, lon)
    print('Nearest station ({}, {}) found at {},{}'.format(
        closest_coordinate['icao'], closest_coordinate['name'],
        closest_coordinate['lat'], closest_coordinate['lon']))
    # Get scans for the nearest station for a given date range
    sql = '''SELECT * FROM station_index a INNER JOIN \
    station_latlon b ON a.name = b.ICAO \
    and (b.lat = {} and b.lon = {} and \
    date(a.date_time) >= '{}' \
    and date(a.date_time) <= '{}')'''.format(closest_coordinate['lat'],
                                             closest_coordinate['lon'],
                                             start_date, end_date)
    files_info = get_records(sql)
    return files_info
def display_scan(filename):
    """
    Use PyART to plot a NEXRAD scan stored in [filename].
    """
    radar = pyart.io.read_nexrad_archive(filename)
    display = pyart.graph.RadarDisplay(radar)
    fig = plt.figure()
    ax = fig.add_subplot()
    display.plot('reflectivity', 0, title='Reflectivity', ax=ax)
    plt.show()
year = '2020'; month = '09'; day = '02'; station = 'KMXX'; time = '011116';
filename = station + year + month + day + '_' + time + '_V06.ar2v'
url = 'https://nexradsa.blob.core.windows.net/nexrad-l2/' + year + '/' + month + '/' + day + \
'/' + station + '/' + filename
filename = download_url(url)
display_scan(filename)
start_date = '2020-08-02'; end_date = '2020-08-15'
# Coordinates near Redmond, WA
lat = 47.6740; lon = -122.1215
# Find all files from the nearest station in the given date range
#
# The first time you call this function, it will download the ~40GB index file.
scan_files = get_scans_for_nearest_station(lat, lon, start_date, end_date)
# MDM files are not actually scans
scan_files = [s for s in scan_files if 'MDM' not in s[6]]
print('Found {} files near station: {}'.format(len(scan_files),scan_files[0][1]))
# Download the first scan
year = str(scan_files[0][2]); month = str(scan_files[0][3]); day = str(scan_files[0][4]);
station = scan_files[0][1]; filename = scan_files[0][6]
url = 'https://nexradsa.blob.core.windows.net/nexrad-l2/' + year.zfill(4) + '/' + \
    month.zfill(2) + '/' + day.zfill(2) + \
    '/' + station + '/' + filename
filename = download_url(url)
display_scan(filename)
# Clean up the temporary folder, including the downloaded index file
shutil.rmtree(temp_dir)