US Producer Price Index - Industry

labor statistics ppi industry

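The Producer Price Index (PPI) is a measure of the average change over time in the selling prices received by domestic producers for their output. The prices included in the PPI are from the first commercial transaction for the products and services covered.

The Producer Price Index Revision-Current Series indexes reflect price movements for the net output of producers, organized according to the North American Industry Classification System (NAICS). The PC dataset is compatible with a wide range of NAICS-based economic time series, including productivity, production, employment, wages, and earnings.

The PPI universe consists of the output of all industries in the goods-producing sectors of the U.S. economy (mining, manufacturing, agriculture, fishing, and forestry), as well as natural gas, electricity, construction, and goods competitive with those made in the producing sectors, such as waste and scrap materials. In addition, as of January 2011, the PPI program covers more than three-quarters of the service sector's output, publishing data for selected industries in the following industry sectors: wholesale and retail trade; transportation and warehousing; information; finance and insurance; real estate brokering, rental, and leasing; professional, scientific, and technical services; administrative, support, and waste management services; health care and social assistance; and accommodation.

A README file containing detailed information about this dataset is available at the original dataset location. Additional information is available in the FAQs.

This dataset is produced from the Producer Price Index data published by the US Bureau of Labor Statistics (BLS). For the terms and conditions related to the use of this dataset, review Linking and Copyright Information and Important Web Site Notices.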

Storage location

This dataset is stored in the East US Azure region. Allocating compute resources in East US is recommended for affinity.

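Notices

Microsoft provides Azure Open Datasets on an "as is" basis. Microsoft makes no warranties, express or implied, guarantees or conditions with respect to your use of the datasets. To the extent permitted under your local law, Microsoft disclaims all liability for any damages or losses, including direct, consequential, special, indirect, incidental or punitive, resulting from your use of the datasets.

This dataset is provided under the original terms under which Microsoft received the source data. The dataset may include data sourced from Microsoft.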

Access

Available in | When to use
Azure Notebooks | Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine.
Azure Databricks | Use this when you need the scale of an Azure managed Spark cluster to process the dataset.
Azure Synapse | Use this when you need the scale of an Azure managed Spark cluster to process the dataset.

Preview

product_code | industry_code | series_id | year | period | value | footnote_codes | seasonal | series_title | industry_name | product_name
2123240 | 212324 | PCU2123242123240 | 1998 | M01 | 117 | nan | U | PPI industry data for Kaolin and ball clay mining-Kaolin and ball clay, not seasonally adjusted | Kaolin and ball clay mining | Kaolin and ball clay
2123240 | 212324 | PCU2123242123240 | 1998 | M02 | 116.9 | nan | U | PPI industry data for Kaolin and ball clay mining-Kaolin and ball clay, not seasonally adjusted | Kaolin and ball clay mining | Kaolin and ball clay
2123240 | 212324 | PCU2123242123240 | 1998 | M03 | 116.3 | nan | U | PPI industry data for Kaolin and ball clay mining-Kaolin and ball clay, not seasonally adjusted | Kaolin and ball clay mining | Kaolin and ball clay
2123240 | 212324 | PCU2123242123240 | 1998 | M04 | 116 | nan | U | PPI industry data for Kaolin and ball clay mining-Kaolin and ball clay, not seasonally adjusted | Kaolin and ball clay mining | Kaolin and ball clay
2123240 | 212324 | PCU2123242123240 | 1998 | M05 | 116.2 | nan | U | PPI industry data for Kaolin and ball clay mining-Kaolin and ball clay, not seasonally adjusted | Kaolin and ball clay mining | Kaolin and ball clay
2123240 | 212324 | PCU2123242123240 | 1998 | M06 | 116.3 | nan | U | PPI industry data for Kaolin and ball clay mining-Kaolin and ball clay, not seasonally adjusted | Kaolin and ball clay mining | Kaolin and ball clay
2123240 | 212324 | PCU2123242123240 | 1998 | M07 | 116.6 | nan | U | PPI industry data for Kaolin and ball clay mining-Kaolin and ball clay, not seasonally adjusted | Kaolin and ball clay mining | Kaolin and ball clay
2123240 | 212324 | PCU2123242123240 | 1998 | M08 | 116.3 | nan | U | PPI industry data for Kaolin and ball clay mining-Kaolin and ball clay, not seasonally adjusted | Kaolin and ball clay mining | Kaolin and ball clay
2123240 | 212324 | PCU2123242123240 | 1998 | M09 | 116.2 | nan | U | PPI industry data for Kaolin and ball clay mining-Kaolin and ball clay, not seasonally adjusted | Kaolin and ball clay mining | Kaolin and ball clay
2123240 | 212324 | PCU2123242123240 | 1998 | M10 | 115.9 | nan | U | PPI industry data for Kaolin and ball clay mining-Kaolin and ball clay, not seasonally adjusted | Kaolin and ball clay mining | Kaolin and ball clay
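
In the preview rows, each series_id appears to be the survey prefix PC, the seasonal code, the industry_code, and the product_code concatenated. A minimal sketch of splitting an ID along those lines follows. The helper name `split_series_id` and the assumption that the industry code is always six characters wide are illustrative and inferred from the sample rows, not stated by the dataset; check https://download.bls.gov/pub/time.series/pc/pc.series before relying on it.

```python
from typing import NamedTuple


class SeriesParts(NamedTuple):
    survey: str         # survey prefix, "PC" in the sample rows
    seasonal: str       # "S" or "U", matching the seasonal column
    industry_code: str
    product_code: str


def split_series_id(series_id: str, industry_len: int = 6) -> SeriesParts:
    """Split a PPI industry series_id into its apparent components.

    Assumption (hypothetical helper): the layout is
    survey (2 chars) + seasonal (1 char) + industry_code + product_code,
    with a fixed-width industry code, as the preview rows suggest.
    """
    sid = series_id.strip()  # tolerate surrounding whitespace in the raw value
    if len(sid) <= 3 + industry_len:
        raise ValueError(f"series_id too short: {series_id!r}")
    return SeriesParts(
        survey=sid[:2],
        seasonal=sid[2],
        industry_code=sid[3:3 + industry_len],
        product_code=sid[3 + industry_len:],
    )


parts = split_series_id("PCU2123242123240")
print(parts)
```

Comparing the parsed pieces against the industry_code and product_code columns of the same row is a cheap sanity check on whether the assumed layout holds for the whole dataset.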
Name | Data type | Unique | Values (sample) | Description
footnote_codes | string | 3 | nan; P | Identifies footnotes for the data series. Most values are null. See https://download.bls.gov/pub/time.series/pc/pc.footnote.
industry_code | string | 1,064 | 221122; 325412 | NAICS code for the industry. For codes and names, see https://download.bls.gov/pub/time.series/pc/pc.industry.
industry_name | string | 842 | Electric power distribution; Pharmaceutical preparation manufacturing | Name corresponding to the industry code. For codes and names, see https://download.bls.gov/pub/time.series/pc/pc.industry.
period | string | 13 | M06; M07 | Identifies the period for which the data is observed. For the full list, see https://download.bls.gov/pub/time.series/pc/pc.period.
product_code | string | 4,822 | 325212P; 22112241 | Code identifying the product to which the data observation refers. For the mapping of industry codes, product codes, and product names, see https://download.bls.gov/pub/time.series/pc/pc.product.
product_name | string | 3,313 | Primary products; Secondary products | Name of the product to which the data observation refers. For the mapping of industry codes, product codes, and product names, see https://download.bls.gov/pub/time.series/pc/pc.product.
seasonal | string | 1 | U | Code identifying whether the data is seasonally adjusted. S = seasonally adjusted; U = unadjusted.
series_id | string | 4,822 | PCU3331313331319; PCU339993339993P | Code identifying the specific series. A time series is a set of data observed over an extended period at consistent time intervals. For series details such as code, name, start year, and end year, see https://download.bls.gov/pub/time.series/pc/pc.series.
series_title | string | 4,588 | PPI industry data for Electric power distribution-East North Central, not seasonally adjusted; PPI industry data for Electric power distribution-New England, not seasonally adjusted |
value | float | 7,658 | 100.0; 100.4000015258789 | Price index for the item.
year | int | 22 | 2015; 2017 | Identifies the year of observation.
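
For time-series work it is often convenient to combine the year and period columns into a single date. The sketch below assumes the BLS convention that M01 through M12 are calendar months and M13 is an annual average (see pc.period to confirm); the column name `date` and the three sample rows are illustrative only.

```python
import pandas as pd

# Illustrative rows with the same column names as the dataset.
sample = pd.DataFrame({
    "year": [1998, 1998, 1998],
    "period": ["M01", "M02", "M13"],
})

# Extract the numeric part of the period code ("M02" -> 2).
month = sample["period"].str[1:].astype(int)

# Assumption: M13 denotes an annual average, so it has no calendar month.
is_monthly = month.between(1, 12)

# Build first-of-month timestamps for monthly rows; other rows become NaT.
parts = pd.DataFrame({
    "year": sample["year"],
    "month": month.where(is_monthly, 1),  # placeholder month for non-monthly rows
    "day": 1,
})
sample["date"] = pd.to_datetime(parts).where(is_monthly)

print(sample)
```

Leaving the annual-average rows as NaT keeps them from being mistaken for a thirteenth monthly observation when the frame is later sorted or resampled by date.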

Azure Notebooks

Language: Python
In [1]:
# This is a package in preview.
from azureml.opendatasets import UsLaborPPIIndustry

labor = UsLaborPPIIndustry()
labor_df = labor.to_pandas_dataframe()
ActivityStarted, to_pandas_dataframe
ActivityStarted, to_pandas_dataframe_in_worker
Looking for parquet files...
Reading them into Pandas dataframe...
Reading ppi_industry/part-00000-tid-1761562550540733469-da319923-1af6-4884-a5f4-16397508d15f-4596-1-c000.snappy.parquet under container laborstatisticscontainer
Done.
ActivityCompleted: Activity=to_pandas_dataframe_in_worker, HowEnded=Success, Duration=7978.44 [ms]
ActivityCompleted: Activity=to_pandas_dataframe, HowEnded=Success, Duration=8014.64 [ms]
In [2]:
labor_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 948634 entries, 0 to 948633
Data columns (total 11 columns):
product_code      948634 non-null object
industry_code     948634 non-null object
series_id         948634 non-null object
year              948634 non-null int32
period            948634 non-null object
value             948634 non-null float32
footnote_codes    948634 non-null object
seasonal          948634 non-null object
series_title      948634 non-null object
industry_name     948634 non-null object
product_name      948634 non-null object
dtypes: float32(1), int32(1), object(9)
memory usage: 72.4+ MB
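
The info() output reports footnote_codes as fully non-null, while the preview shows nan in every row. That suggests missing footnotes are stored as the literal string "nan" rather than as true nulls. This reading is an assumption; if it holds, a sketch like the following converts them to real missing values. The small frame here stands in for labor_df.

```python
import pandas as pd

# Stand-in for labor_df, reduced to the column of interest.
df = pd.DataFrame({"footnote_codes": ["nan", "P", "nan"]})

# Assumption: the literal string "nan" marks a missing footnote.
# mask() replaces the matching entries with a real missing value (NaN).
df["footnote_codes"] = df["footnote_codes"].mask(df["footnote_codes"] == "nan")

# Count how many rows now have no footnote.
print(df["footnote_codes"].isna().sum())
```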
In [1]:
# Pip install packages
import os, sys

!{sys.executable} -m pip install azure-storage-blob
!{sys.executable} -m pip install pyarrow
!{sys.executable} -m pip install pandas
In [2]:
# Azure storage access info
azure_storage_account_name = "azureopendatastorage"
azure_storage_sas_token = r""
container_name = "laborstatisticscontainer"
folder_name = "ppi_industry/"
In [3]:
import os

from azure.storage.blob import BlobServiceClient

if azure_storage_account_name is None or azure_storage_sas_token is None:
    raise Exception(
        "Provide your specific name and key for your Azure Storage account--see the Prerequisites section earlier.")

print('Looking for the first parquet under the folder ' +
      folder_name + ' in container "' + container_name + '"...')
container_url = f"https://{azure_storage_account_name}.blob.core.windows.net/"
blob_service_client = BlobServiceClient(
    container_url, azure_storage_sas_token if azure_storage_sas_token else None)

container_client = blob_service_client.get_container_client(container_name)
blobs = container_client.list_blobs(folder_name)
sorted_blobs = sorted(list(blobs), key=lambda e: e.name, reverse=True)
targetBlobName = ''
for blob in sorted_blobs:
    if blob.name.startswith(folder_name) and blob.name.endswith('.parquet'):
        targetBlobName = blob.name
        break

print('Target blob to download: ' + targetBlobName)
_, filename = os.path.split(targetBlobName)
blob_client = container_client.get_blob_client(targetBlobName)
with open(filename, 'wb') as local_file:
    # readinto() streams the blob into the open file handle
    blob_client.download_blob().readinto(local_file)
In [4]:
# Read the parquet file into Pandas data frame
import pandas as pd

print('Reading the parquet file into Pandas data frame')
df = pd.read_parquet(filename)
In [5]:
# You can add your own filter below
print('Loaded as a Pandas data frame: ')
df
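
As one example of such a filter, the sketch below selects a single series and computes the month-over-month percent change of its index. The rows are copied from the dataset preview; in practice `df` would be the frame read from the parquet file, and the column name `pct_change` is just an illustrative choice.

```python
import pandas as pd

# Rows copied from the dataset preview (Kaolin and ball clay, 1998).
df = pd.DataFrame({
    "series_id": ["PCU2123242123240"] * 4,
    "year": [1998] * 4,
    "period": ["M01", "M02", "M03", "M04"],
    "value": [117.0, 116.9, 116.3, 116.0],
})

# Keep one series and order it chronologically.
# strip() guards against stray whitespace around the ID.
one = (
    df[df["series_id"].str.strip() == "PCU2123242123240"]
    .sort_values(["year", "period"])
    .reset_index(drop=True)
)

# Percent change from the previous period: (current / previous - 1) * 100.
one["pct_change"] = one["value"].pct_change() * 100

print(one[["year", "period", "value", "pct_change"]])
```

The first row has no previous period, so its percent change is missing by construction.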

Azure Databricks

Language: Python
In [1]:
# This is a package in preview.
from azureml.opendatasets import UsLaborPPIIndustry

labor = UsLaborPPIIndustry()
labor_df = labor.to_spark_dataframe()
ActivityStarted, to_spark_dataframe
ActivityStarted, to_spark_dataframe_in_worker
ActivityCompleted: Activity=to_spark_dataframe_in_worker, HowEnded=Success, Duration=2665.84 [ms]
ActivityCompleted: Activity=to_spark_dataframe, HowEnded=Success, Duration=2668.22 [ms]
In [2]:
display(labor_df.limit(5))
product_code | industry_code | series_id | year | period | value | footnote_codes | seasonal | series_title | industry_name | product_name
2123240 | 212324 | PCU2123242123240 | 1998 | M01 | 117.0 | nan | U | PPI industry data for Kaolin and ball clay mining-Kaolin and ball clay, not seasonally adjusted | Kaolin and ball clay mining | Kaolin and ball clay
2123240 | 212324 | PCU2123242123240 | 1998 | M02 | 116.9 | nan | U | PPI industry data for Kaolin and ball clay mining-Kaolin and ball clay, not seasonally adjusted | Kaolin and ball clay mining | Kaolin and ball clay
2123240 | 212324 | PCU2123242123240 | 1998 | M03 | 116.3 | nan | U | PPI industry data for Kaolin and ball clay mining-Kaolin and ball clay, not seasonally adjusted | Kaolin and ball clay mining | Kaolin and ball clay
2123240 | 212324 | PCU2123242123240 | 1998 | M04 | 116.0 | nan | U | PPI industry data for Kaolin and ball clay mining-Kaolin and ball clay, not seasonally adjusted | Kaolin and ball clay mining | Kaolin and ball clay
2123240 | 212324 | PCU2123242123240 | 1998 | M05 | 116.2 | nan | U | PPI industry data for Kaolin and ball clay mining-Kaolin and ball clay, not seasonally adjusted | Kaolin and ball clay mining | Kaolin and ball clay
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "laborstatisticscontainer"
blob_relative_path = "ppi_industry/"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# Spark reads parquet lazily; no data is loaded at this point
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))

Azure Synapse

Language: Python
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "laborstatisticscontainer"
blob_relative_path = "ppi_industry/"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# Spark reads parquet lazily; no data is loaded at this point
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))