
US Consumer Price Index

labor statistics cpi

The Consumer Price Index (CPI) is a measure of the average change over time in the prices paid by urban consumers for a market basket of consumer goods and services.
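Because the CPI is an index rather than a price, it is usually read as a percent change between two observations. The following is a minimal sketch of that arithmetic, using the two electricity index values for San Diego-Carlsbad, CA that appear in the preview on this page. The function name `percent_change` is an illustrative assumption and is not part of the dataset or any Azure package.

```python
def percent_change(old_index: float, new_index: float) -> float:
    """Return the percent change from old_index to new_index."""
    if old_index <= 0:
        raise ValueError("old_index must be positive")
    return (new_index / old_index - 1.0) * 100.0


# Index values taken from the preview rows on this page
# (series CUURS49ESEHF01, periods 2017 M12 and 2018 M01).
dec_2017 = 279.974
jan_2018 = 284.456

change = percent_change(dec_2017, jan_2018)
print(f"{change:.2f}%")  # prints 1.60%
```

The same function applies to any pair of observations from one series, for example the same month one year apart for a 12-month change.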

A README file containing detailed information about this dataset is available at the original dataset location.

This dataset is sourced from Consumer Price Index data published by the US Bureau of Labor Statistics (BLS). Review the Linking and Copyright Information and the Important Web Site Notices for the terms and conditions that apply to the use of this dataset.

Storage location

This dataset is stored in the East US Azure region. Allocating compute resources in East US is recommended for affinity.

Related datasets

Notices

MICROSOFT PROVIDES AZURE OPEN DATASETS ON AN "AS IS" BASIS. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, GUARANTEES OR CONDITIONS WITH RESPECT TO YOUR USE OF THE DATASETS. TO THE EXTENT PERMITTED UNDER APPLICABLE LAW, MICROSOFT DISCLAIMS ALL LIABILITY FOR ANY DAMAGES OR LOSSES, INCLUDING DIRECT, CONSEQUENTIAL, SPECIAL, INDIRECT, INCIDENTAL OR PUNITIVE DAMAGES, RESULTING FROM YOUR USE OF THE DATASETS.

This dataset is provided under the original terms under which Microsoft received the source data. The dataset may include data sourced from Microsoft.

Access

Available in | When to use
Azure Notebooks | Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine.
Azure Databricks | Use this when you need the scale of an Azure managed Spark cluster to process the dataset.
Azure Synapse | Use this when you need the scale of an Azure managed Spark cluster to process the dataset.

Preview

area_code | item_code | series_id | year | period | value | footnote_codes | seasonal | periodicity_code | series_title | item_name | area_name
S49E | SEHF01 | CUURS49ESEHF01 | 2017 | M12 | 279.974 | nan | U | R | Electricity in San Diego-Carlsbad, CA, all urban consumers, not seasonally adjusted | Electricity | San Diego-Carlsbad, CA
S49E | SEHF01 | CUURS49ESEHF01 | 2017 | M12 | 279.974 | nan | U | R | Electricity in San Diego-Carlsbad, CA, all urban consumers, not seasonally adjusted | Electricity | San Diego-Carlsbad, CA
S49E | SEHF01 | CUURS49ESEHF01 | 2017 | M12 | 279.974 | nan | U | R | Electricity in San Diego-Carlsbad, CA, all urban consumers, not seasonally adjusted | Electricity | San Diego-Carlsbad, CA
S49E | SEHF01 | CUURS49ESEHF01 | 2017 | M12 | 279.974 | nan | U | R | Electricity in San Diego-Carlsbad, CA, all urban consumers, not seasonally adjusted | Electricity | San Diego-Carlsbad, CA
S49E | SEHF01 | CUURS49ESEHF01 | 2017 | M12 | 279.974 | nan | U | R | Electricity in San Diego-Carlsbad, CA, all urban consumers, not seasonally adjusted | Electricity | San Diego-Carlsbad, CA
S49E | SEHF01 | CUURS49ESEHF01 | 2017 | M12 | 279.974 | nan | U | R | Electricity in San Diego-Carlsbad, CA, all urban consumers, not seasonally adjusted | Electricity | San Diego-Carlsbad, CA
S49E | SEHF01 | CUURS49ESEHF01 | 2018 | M01 | 284.456 | nan | U | R | Electricity in San Diego-Carlsbad, CA, all urban consumers, not seasonally adjusted | Electricity | San Diego-Carlsbad, CA
S49E | SEHF01 | CUURS49ESEHF01 | 2018 | M01 | 284.456 | nan | U | R | Electricity in San Diego-Carlsbad, CA, all urban consumers, not seasonally adjusted | Electricity | San Diego-Carlsbad, CA
S49E | SEHF01 | CUURS49ESEHF01 | 2018 | M01 | 284.456 | nan | U | R | Electricity in San Diego-Carlsbad, CA, all urban consumers, not seasonally adjusted | Electricity | San Diego-Carlsbad, CA
S49E | SEHF01 | CUURS49ESEHF01 | 2018 | M01 | 284.456 | nan | U | R | Electricity in San Diego-Carlsbad, CA, all urban consumers, not seasonally adjusted | Electricity | San Diego-Carlsbad, CA
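The preview rows suggest that a series_id is a concatenation of other columns: in CUURS49ESEHF01, the characters after the two-letter prefix match the seasonal code (U), the periodicity_code (R), the area_code (S49E), and the item_code (SEHF01). The sketch below splits an ID on that assumption of fixed positions (2 + 1 + 1 + 4 characters, then the item code). The `SeriesId` type and `parse_series_id` function are hypothetical helpers, not part of the dataset; check the BLS cu.txt file before relying on this layout.

```python
from typing import NamedTuple


class SeriesId(NamedTuple):
    survey: str            # two-letter prefix, e.g. "CU" or "CW"
    seasonal: str          # "S" or "U"
    periodicity_code: str  # "R" or "S"
    area_code: str         # four characters, e.g. "S49E" or "0000"
    item_code: str         # remaining characters, e.g. "SEHF01"


def parse_series_id(series_id: str) -> SeriesId:
    """Split a CPI series_id into its parts, assuming fixed positions."""
    series_id = series_id.strip()
    if len(series_id) < 9:
        raise ValueError(f"series_id too short: {series_id!r}")
    return SeriesId(
        survey=series_id[0:2],
        seasonal=series_id[2],
        periodicity_code=series_id[3],
        area_code=series_id[4:8],
        item_code=series_id[8:],
    )


parsed = parse_series_id("CUURS49ESEHF01")
print(parsed.area_code, parsed.item_code)  # prints S49E SEHF01
```

Splitting the ID this way can serve as a consistency check against the area_code, item_code, seasonal, and periodicity_code columns of the same row.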
Name | Data type | Unique | Values (sample) | Description
area_code | string | 70 | 0000; 0300 | Unique code used to identify a specific geographic area. See the full list of area codes at http://download.bls.gov/pub/time.series/cu/cu.area
area_name | string | 69 | U.S. city average; South | Name of a specific geographic area. See all area names and codes at https://download.bls.gov/pub/time.series/cu/cu.area.
footnote_codes | string | 3 | nan; U | Identifies footnotes for the data series. Most values are null.
item_code | string | 515 | SA0E; SAF11 | Identifies the item to which the data observations refer. See all item names and codes at https://download.bls.gov/pub/time.series/cu/cu.item.
item_name | string | 515 | Energy; Food at home | Full names of the items. See item names and codes at https://download.bls.gov/pub/time.series/cu/cu.txt.
period | string | 16 | S01; S02 | Identifies the period in which the data was observed. Format: M01-M13 or S01-S03 (M = monthly, M13 = annual average, S = semi-annual). Example: M06 = June. See period names and codes at https://download.bls.gov/pub/time.series/cu/cu.period.
periodicity_code | string | 3 | R; S | Frequency of data observation. S = semi-annual; R = regular.
seasonal | string | 1,043 | U; S | Code identifying whether the data is seasonally adjusted. S = seasonally adjusted; U = unadjusted.
series_id | string | 16,683 | CUURS300SAD; CUURS300SAF11 | Code identifying the specific series. A time series is a set of data observed over an extended period at consistent time intervals (for example monthly, quarterly, semi-annually, or annually). BLS time series data is typically produced at monthly intervals and ranges from a specific consumer item in a specific geographic area whose price is collected monthly, to a category of worker in a specific industry whose employment rate is recorded monthly, and so on. See https://download.bls.gov/pub/time.series/cu/cu.txt for more information.
series_title | string | 8,336 | Alcoholic beverages in U.S. city average, all urban consumers, not seasonally adjusted; Transportation in Los Angeles-Long Beach-Anaheim, CA, all urban consumers, not seasonally adjusted | Series title of the corresponding series_id. See series IDs and titles at https://download.bls.gov/pub/time.series/cu/cu.series.
value | float | 310,603 | 100.0; 101.0999984741211 | Price index for the item.
year | int | 25 | 2018; 2017 | Identifies the year of observation.
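The period column mixes monthly codes, semi-annual codes, and averages, so it often helps to decode it before analysis. Here is a minimal sketch of such a decoder. The meanings of M01-M13 come from the column description above; reading S01 and S02 as the first and second half-year and S03 as an annual average is an assumption that should be confirmed against the BLS cu.period file. The function name `describe_period` is illustrative only.

```python
import re

MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def describe_period(period: str) -> str:
    """Return a human-readable label for a CPI period code."""
    match = re.fullmatch(r"([MS])(\d{2})", period)
    if match is None:
        raise ValueError(f"unrecognized period code: {period!r}")
    kind, number = match.group(1), int(match.group(2))
    if kind == "M" and 1 <= number <= 12:
        return MONTH_NAMES[number - 1]
    if kind == "M" and number == 13:
        return "Annual average"
    # Assumption: S01/S02 are half-years and S03 is an annual average.
    if kind == "S" and number == 1:
        return "First half"
    if kind == "S" and number == 2:
        return "Second half"
    if kind == "S" and number == 3:
        return "Annual average (semi-annual series)"
    raise ValueError(f"period number out of range: {period!r}")


print(describe_period("M06"))  # prints June
```

Rows with M13 or S03 are averages rather than point-in-time observations, so they are usually excluded before plotting a monthly series.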

Azure Notebooks

Language: Python
In [2]:
# This is a package in preview.
from azureml.opendatasets import UsLaborCPI

usLaborCPI = UsLaborCPI()
usLaborCPI_df = usLaborCPI.to_pandas_dataframe()
ActivityStarted, to_pandas_dataframe
ActivityStarted, to_pandas_dataframe_in_worker
Looking for parquet files...
Reading them into Pandas dataframe...
Reading cpi/part-00000-tid-8289857611821412231-4ef1bca9-6386-4e12-8c7a-31d3ff5d4bc7-3154-1-c000.snappy.parquet under container laborstatisticscontainer
Done.
ActivityCompleted: Activity=to_pandas_dataframe_in_worker, HowEnded=Success, Duration=29342.59 [ms]
ActivityCompleted: Activity=to_pandas_dataframe, HowEnded=Success, Duration=29374.5 [ms]
In [3]:
usLaborCPI_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11624937 entries, 0 to 11624936
Data columns (total 12 columns):
area_code      object
item_code      object
series_id      object
year        int32
period       object
value        float32
footnote_codes   object
seasonal      object
periodicity_code  object
series_title    object
item_name      object
area_name      object
dtypes: float32(1), int32(1), object(10)
memory usage: 975.6+ MB
In [1]:
# Pip install packages
import os, sys

!{sys.executable} -m pip install azure-storage-blob
!{sys.executable} -m pip install pyarrow
!{sys.executable} -m pip install pandas
In [2]:
# Azure storage access info
azure_storage_account_name = "azureopendatastorage"
azure_storage_sas_token = r""
container_name = "laborstatisticscontainer"
folder_name = "cpi/"
In [3]:
# BlobServiceClient is the entry point of the azure-storage-blob v12 SDK
# (BlockBlobService belongs to the legacy v2 SDK and is not used here).
from azure.storage.blob import BlobServiceClient

if azure_storage_account_name is None or azure_storage_sas_token is None:
  raise Exception(
    "Provide your specific name and key for your Azure Storage account--see the Prerequisites section earlier.")

print('Looking for the first parquet under the folder ' +
   folder_name + ' in container "' + container_name + '"...')
# This URL points at the storage account; an empty SAS token means anonymous access.
account_url = f"https://{azure_storage_account_name}.blob.core.windows.net/"
blob_service_client = BlobServiceClient(
  account_url, azure_storage_sas_token if azure_storage_sas_token else None)

container_client = blob_service_client.get_container_client(container_name)
blobs = container_client.list_blobs(folder_name)
sorted_blobs = sorted(list(blobs), key=lambda e: e.name, reverse=True)
targetBlobName = ''
for blob in sorted_blobs:
  if blob.name.startswith(folder_name) and blob.name.endswith('.parquet'):
    targetBlobName = blob.name
    break

print('Target blob to download: ' + targetBlobName)
_, filename = os.path.split(targetBlobName)
blob_client = container_client.get_blob_client(targetBlobName)
with open(filename, 'wb') as local_file:
  # readinto() streams the blob into the open file (download_to_stream is deprecated)
  blob_client.download_blob().readinto(local_file)
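The cell above picks its target by sorting blob names in descending order and taking the first one that sits under the folder and ends in .parquet. The sketch below isolates that selection step as a pure function so it can be checked without a network call. The function `pick_latest_parquet` and the blob names in the example are hypothetical; only the selection rule mirrors the cell.

```python
from typing import Iterable, Optional


def pick_latest_parquet(blob_names: Iterable[str], folder: str) -> Optional[str]:
    """Return the lexicographically last .parquet name under folder, or None."""
    candidates = [
        name for name in blob_names
        if name.startswith(folder) and name.endswith(".parquet")
    ]
    # max() over names matches "sort descending, take the first match".
    return max(candidates, default=None)


# Hypothetical blob names, for illustration only.
names = [
    "cpi/_SUCCESS",
    "cpi/part-00000-aaa.snappy.parquet",
    "cpi/part-00001-bbb.snappy.parquet",
    "other/part-00009-zzz.snappy.parquet",
]
print(pick_latest_parquet(names, "cpi/"))
```

Returning None when nothing matches makes the empty case explicit, whereas the cell above would go on to open a file with an empty name.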
In [4]:
# Read the parquet file into Pandas data frame
import pandas as pd

print('Reading the parquet file into Pandas data frame')
df = pd.read_parquet(filename)
In [5]:
# Add your own filters below
print('Loaded as a Pandas data frame: ')
df
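As one example of a filter, the sketch below extracts a single monthly series from a data frame that has the columns described earlier, drops the M13 annual-average rows, and indexes the result by date. It runs on a small hand-built frame so that it does not depend on the download; the M13 value in that frame is a placeholder, not real data, and the function name `monthly_series` is an assumption made for illustration.

```python
import pandas as pd


def monthly_series(frame: pd.DataFrame, series_id: str) -> pd.Series:
    """Return the monthly values of one series, indexed by month start date."""
    is_month = frame["period"].str.match(r"^M(0[1-9]|1[0-2])$")
    rows = frame.loc[(frame["series_id"] == series_id) & is_month].copy()
    # Build an ISO date string such as "2017-12-01" from year and period.
    rows["date"] = pd.to_datetime(
        rows["year"].astype(str) + "-" + rows["period"].str[1:] + "-01"
    )
    return rows.sort_values("date").set_index("date")["value"]


# Hand-built sample with the dataset's column names.
# The M13 value is a placeholder, not a real observation.
sample = pd.DataFrame({
    "series_id": ["CUURS49ESEHF01"] * 3,
    "year": [2018, 2017, 2017],
    "period": ["M01", "M13", "M12"],
    "value": [284.456, 0.0, 279.974],
})

electricity = monthly_series(sample, "CUURS49ESEHF01")
print(electricity)
```

With the full dataset, passing `df` in place of `sample` should work the same way, assuming the column names match the schema listed above.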

Azure Databricks

Language: Python
In [1]:
# This is a package in preview.
from azureml.opendatasets import UsLaborCPI

usLaborCPI = UsLaborCPI()
usLaborCPI_df = usLaborCPI.to_spark_dataframe()
ActivityStarted, to_spark_dataframe
ActivityStarted, to_spark_dataframe_in_worker
ActivityCompleted: Activity=to_spark_dataframe_in_worker, HowEnded=Success, Duration=3007.07 [ms]
ActivityCompleted: Activity=to_spark_dataframe, HowEnded=Success, Duration=3011.43 [ms]
In [2]:
display(usLaborCPI_df.limit(5))
area_code | item_code | series_id | year | period | value | footnote_codes | seasonal | periodicity_code | series_title | item_name | area_name
S49E | SEHF01 | CWURS49ESEHF01 | 2017 | M12 | 279.976 | nan | U | R | Electricity in San Diego-Carlsbad, CA, urban wage earners and clerical workers, not seasonally adjusted | Electricity | San Diego-Carlsbad, CA
S49E | SEHF01 | CWURS49ESEHF01 | 2017 | M12 | 279.976 | nan | U | R | Electricity in San Diego-Carlsbad, CA, urban wage earners and clerical workers, not seasonally adjusted | Electricity | San Diego-Carlsbad, CA
S49E | SEHF01 | CWURS49ESEHF01 | 2017 | M12 | 279.976 | nan | U | R | Electricity in San Diego-Carlsbad, CA, urban wage earners and clerical workers, not seasonally adjusted | Electricity | San Diego-Carlsbad, CA
S49E | SEHF01 | CWURS49ESEHF01 | 2017 | M12 | 279.976 | nan | U | R | Electricity in San Diego-Carlsbad, CA, urban wage earners and clerical workers, not seasonally adjusted | Electricity | San Diego-Carlsbad, CA
S49E | SEHF01 | CWURS49ESEHF01 | 2017 | M12 | 279.976 | nan | U | R | Electricity in San Diego-Carlsbad, CA, urban wage earners and clerical workers, not seasonally adjusted | Electricity | San Diego-Carlsbad, CA
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "laborstatisticscontainer"
blob_relative_path = "cpi/"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
 'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
 blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# Read the parquet files with Spark; the read is lazy, so no data is loaded yet
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))

Azure Synapse

Language: Python
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "laborstatisticscontainer"
blob_relative_path = "cpi/"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
 'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
 blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# Read the parquet files with Spark; the read is lazy, so no data is loaded yet
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))