略過導覽

NAIP

AerialImagery AIforEarth USDA

來自國家農業影像計畫 (NAIP) 的航照圖。

NAIP 提供全美國的高解析度航照圖。 此計畫由美國農業部 (USDA) 內的田野航空攝影辦公室 (AFPO) 管理。 可取得從 2010 年至今的資料。

儲存體資源

資料會在位於美國東部資料中心的 Azure Blob 儲存體中,在下列 Blob 容器儲存於最適合雲端的 GeoTIFF 檔案內:

https://naipblobs.blob.core.windows.net/naip

在該容器內,資料會根據下列項目進行組織:

v002/[state]/[year]/[state]_[resolution]_[year]/[quadrangle]/filename

例如:

v002/al/2015/al_100cm_2015/30086/m_3008601_ne_16_1_20150804.tif

這些欄位有更多詳細資料:

  • 年份:四位數的年份。 每 3-5 年會收集一次各州的影像,因此每個年份都會包含某幾個 (而不是全部的) 州。 例如,阿拉巴馬州具有 2011 年和 2013 年的資料,但沒有 2012 年的資料,而加利福尼亞州有 2012 年的資料,但沒有 2011 年或 2013 年的資料。 Esri 在其互動式 NAIP 年度涵蓋範圍地圖中提供 NAIP 涵蓋範圍的相關資訊。
  • :兩個字母的州代碼。
  • 解析度:影像解析度字串規格,會依不同 NAIP 歷程記錄而異。 視年份和州而定,可能為 “050cm”、“060cm” 或 “100cm”。
  • 四邊形USGS 四邊形識別碼,指定 7.5 分鐘 x 7.5 分鐘的區域。

檔案會儲存為最適合雲端的 GeoTIFF 影像,副檔名為 .tif。 這些檔案由 Esri (從 USDA 提供的原始格式) 產生及整理。

各個影像也可提供小型縮圖;將 “.tif” 替換為 “.200.jpg” 即可擷取縮圖。 例如:

https://naipblobs.blob.core.windows.net/naip/v002/al/2013/al_100cm_2013/30086/m_3008601_ne_16_1_20130928.200.jpg

您可以在 [“資料存取”] 下方提供的筆記本中,找到存取和繪製 NAIP 影像的完整 Python 範例。

我們也提供唯讀 SAS (共用存取簽章) 權杖,以允許透過 BlobFuse 等方式存取 NAIP 資料,這種方式可讓您將 Blob 容器掛接為磁碟機:

st=2019-07-18T03%3A53%3A22Z&se=2035-07-19T03%3A53%3A00Z&sp=rl&sv=2018-03-28&sr=c&sig=2RIXmLbLbiagYnUd49rgx2kOXKyILrJOgafmkODhRAQ%3D

如需 Linux 的掛接指示,請前往這裡

NAIP 資料可能耗用數百 TB,因此最好在儲存影像的美國東部 Azure 資料中心執行大規模處理。 如果您將 NAIP 資料用於環境科學應用程式,請考慮申請 AI for Earth 補助金,以支援您的計算需求。

索引

此處提供所有 NAIP 檔案的清單,格式為 .csv 壓縮檔案:

https://naipblobs.blob.core.windows.net/naip-index/naip_v002_index.zip

我們也會持續提供 rtree 物件,以輔助 Python 使用者進行空間查詢;如需詳細資料,請查看範例筆記本

您也可以在此處瀏覽資料。

.mrf 檔案有什麼變化?

我們在 2020 年 6 月更新了整個 NAIP 封存,以改善涵蓋範圍及可維護性。 我們也從 .mrf 格式切換為最適合雲端的 GeoTIFF,並對路徑結構進行了一些變更。 mrf 檔案仍可暫時在其他容器中使用;如果該檔案對您的工作至關重要,且您需要存取權,請連絡 aiforearthdatasets@microsoft.com

精美圖片


2017 年 Microsoft Redmond Campus 附近區域的 1 公尺解析度影像。

Contact

對於此資料集如有任何問題,請連絡 aiforearthdatasets@microsoft.com

Access

Available inWhen to use
Azure Notebooks

Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine.

Select your preferred service:

Azure Notebooks

Azure Notebooks

Package: Language: Python

Demo notebook for accessing NAIP data on Azure

This notebook provides an example of accessing NAIP data from blob storage on Azure, displaying an image using the rasterio library.

We will demonstrate how to access and plot a tile given a known tile filename, as well as how to access tiles by lat/lon. Finally, we'll demonstrate how to retrieve only the patches you care about from our cloud-optimized image files.

NAIP data are stored in the West Europe Azure region, so this notebook will run most efficiently on Azure compute located in West Europe. We recommend that substantial computation depending on NAIP data also be situated in West Europe. You don't want to download hundreds of terabytes to your laptop! If you are using NAIP data for environmental science applications, consider applying for an AI for Earth grant to support your compute requirements.

Imports and environment

In [12]:
# Standard packages
import tempfile
import warnings
import urllib
import shutil
import os

# Less standard, but still pip- or conda-installable
import matplotlib.pyplot as plt
import numpy as np
import rasterio
import re
import rtree
import shapely
import pickle

# pip install progressbar2, not progressbar
import progressbar

from geopy.geocoders import Nominatim
from rasterio.windows import Window 
from tqdm import tqdm

latest_wkid = 3857
crs = "EPSG:4326"

# Storage locations are documented at http://aka.ms/ai4edata-naip

# The(preferred) copy of NAIP in the West Europe Azure region
blob_root = 'https://naipeuwest.blob.core.windows.net/naip'

# NAIP is also available in the East US Azure region
# blob_root = 'https://naipblobs.blob.core.windows.net/naip'

index_files = ["tile_index.dat", "tile_index.idx", "tiles.p"]
index_blob_root = re.sub('/naip$','/naip-index/rtree/',blob_root)
temp_dir = os.path.join(tempfile.gettempdir(),'naip')
os.makedirs(temp_dir,exist_ok=True)

# Spatial index that maps lat/lon to NAIP tiles; we'll load this when we first 
# need to access it.
index = None

# URL where we've stashed a geojson file with the boundaries of Maryland.  Why do we
# need the boundaries of Maryland?  It's a surprise, you'll have to keep reading to find
# out.
maryland_boundary_url = 'https://ai4edatasetspublicassets.blob.core.windows.net/assets/maryland.json'

warnings.filterwarnings("ignore")
%matplotlib inline

Functions

In [6]:
class DownloadProgressBar():
    """
    https://stackoverflow.com/questions/37748105/how-to-use-progressbar-module-with-urlretrieve
    """
    
    def __init__(self):
        self.pbar = None

    def __call__(self, block_num, block_size, total_size):
        if not self.pbar:
            self.pbar = progressbar.ProgressBar(max_value=total_size)
            self.pbar.start()
            
        downloaded = block_num * block_size
        if downloaded < total_size:
            self.pbar.update(downloaded)
        else:
            self.pbar.finish()
            

class NAIPTileIndex:
    """
    Utility class for performing NAIP tile lookups by location.
    """
    
    tile_rtree = None
    tile_index = None
    base_path = None
    
    def __init__(self, base_path=None):
        
        if base_path is None:
            
            base_path = temp_dir
            os.makedirs(base_path,exist_ok=True)
            
            for file_path in index_files:
                download_url(index_blob_root + file_path, base_path + '/' + file_path,
                             progress_updater=DownloadProgressBar())
                
        self.base_path = base_path
        self.tile_rtree = rtree.index.Index(base_path + "/tile_index")
        self.tile_index = pickle.load(open(base_path  + "/tiles.p", "rb"))
      
    
    def lookup_tile(self, lat, lon):
        """"
        Given a lat/lon coordinate pair, return the list of NAIP tiles that contain
        that location.

        Returns a list of COG file paths.
        """

        point = shapely.geometry.Point(float(lon),float(lat))
        intersected_indices = list(self.tile_rtree.intersection(point.bounds))

        intersected_files = []
        tile_intersection = False

        for idx in intersected_indices:

            intersected_file = self.tile_index[idx][0]
            intersected_geom = self.tile_index[idx][1]
            if intersected_geom.contains(point):
                tile_intersection = True
                intersected_files.append(intersected_file)

        if not tile_intersection and len(intersected_indices) > 0:
            print('''Error: there are overlaps with tile index, 
                      but no tile completely contains selection''')   
            return None
        elif len(intersected_files) <= 0:
            print("No tile intersections")
            return None
        else:
            return intersected_files
        
            
def download_url(url, destination_filename=None, progress_updater=None, force_download=False):
    """
    Download a URL to a temporary file
    """
    
    # This is not intended to guarantee uniqueness, we just know it happens to guarantee
    # uniqueness for this application.
    if destination_filename is None:
        url_as_filename = url.replace('://', '_').replace('/', '_')    
        destination_filename = \
            os.path.join(temp_dir,url_as_filename)
    if (not force_download) and (os.path.isfile(destination_filename)):
        print('Bypassing download of already-downloaded file {}'.format(os.path.basename(url)))
        return destination_filename
    print('Downloading file {} to {}'.format(os.path.basename(url),destination_filename),end='')
    urllib.request.urlretrieve(url, destination_filename, progress_updater)  
    assert(os.path.isfile(destination_filename))
    nBytes = os.path.getsize(destination_filename)
    print('...done, {} bytes.'.format(nBytes))
    return destination_filename
    

def display_naip_tile(filename):
    """
    Display a NAIP tile using rasterio.
    """
    
    # NAIP tiles are enormous; downsize for plotting in this notebook
    dsfactor = 10
    
    with rasterio.open(filename) as raster:

        # NAIP imagery has four channels: R, G, B, IR
        #
        # Stack RGB channels into an image; we won't try to render the IR channel
        #
        # rasterio uses 1-based indexing for channels.
        h = int(raster.height/dsfactor)
        w = int(raster.width/dsfactor)
        print('Resampling to {},{}'.format(h,w))
        r = raster.read(1, out_shape=(1, h, w))
        g = raster.read(2, out_shape=(1, h, w))
        b = raster.read(3, out_shape=(1, h, w))        
    
    rgb = np.dstack((r,g,b))
    fig = plt.figure(figsize=(7.5, 7.5), dpi=100, edgecolor='k')
    plt.imshow(rgb)
    raster.close()
    
    
def get_coordinates_from_address(address):
    """
    Look up the lat/lon coordinates for an address.
    """
    
    geolocator = Nominatim(user_agent="NAIP")
    location = geolocator.geocode(address)
    print('Retrieving location for address:\n{}'.format(location.address))
    return location.latitude, location.longitude

Access and plot a NAIP tile by constructing a path

In [7]:
# Tiles are stored at:
#
# [blob root]/v002/[state]/[year]/[state]_[resolution]_[year]/[quadrangle]/filename

year = '2015'
state = 'al'
resolution = '100cm'
quadrangle = '30086'
filename = 'm_3008601_ne_16_1_20150804.tif'
tile_url = blob_root + '/v002/' + state + '/' + year + '/' + state + '_' + resolution + \
    '_' + year + '/' + quadrangle + '/' + filename

print(tile_url)

# Download the image
image_filename = download_url(tile_url,progress_updater=DownloadProgressBar())

# Plot the image
print('Reading file:\n{}'.format(os.path.basename(image_filename)))
assert os.path.isfile(image_filename)
display_naip_tile(image_filename)
https://naipeuwest.blob.core.windows.net/naip/v002/al/2015/al_100cm_2015/30086/m_3008601_ne_16_1_20150804.tif
Bypassing download of already-downloaded file m_3008601_ne_16_1_20150804.tif
Reading file:
https_naipeuwest.blob.core.windows.net_naip_v002_al_2015_al_100cm_2015_30086_m_3008601_ne_16_1_20150804.tif
Resampling to 753,657

Load the spatial index of NAIP tiles

In [14]:
if index is None:
    index = NAIPTileIndex()

Access and plot a NAIP tile based on a lat/lon coordinate pair

In [16]:
lat = 47.645950
lon = -122.136980

naip_files = index.lookup_tile(lat, lon)

print('\nList of available naip files for this location:\n')
for file in naip_files:
    print(file)
print('')

# Choose the latest NAIP tile from this location
#
# Images are indexed in reverse-chronological order, so zero is the
# most recent image.