US National Employment Hours and Earnings

labor statistics employment hours earnings national

The Current Employment Statistics (CES) program produces detailed industry estimates of nonfarm employment, hours, and earnings, based on US payroll records.

A README file with detailed information about this dataset is available at the dataset's original location.

This dataset is sourced from the "Current Employment Statistics - CES (National)" data published by the US Bureau of Labor Statistics (BLS). For the terms and conditions that apply to using this dataset, review "Linking and Copyright Information" and "Important Web Site Notices".

Storage location

This dataset is stored in the East US Azure region. Allocating compute resources in East US is recommended for affinity.

Related datasets

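Notices

Microsoft provides Azure Open Datasets on an "as is" basis. Microsoft makes no warranties, express or implied, guarantees, or conditions with respect to your use of the datasets. To the extent permitted under your local law, Microsoft disclaims all liability for any damages or losses, including direct, consequential, special, indirect, incidental, or punitive, resulting from your use of the datasets.

This dataset is provided under the original terms under which Microsoft received the source data. The dataset may include data sourced from Microsoft.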

Access

Available in | When to use
Azure Notebooks | Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine.
Azure Databricks | Use this when you need the scale of an Azure managed Spark cluster to process the dataset.
Azure Synapse | Use this when you need the scale of an Azure managed Spark cluster to process the dataset.

Preview

data_type_code | industry_code | supersector_code | series_id | year | period | value | footnote_codes | seasonal | series_title | supersector_name | industry_name | data_type_text
26 | 5000000 | 5 | CES0500000026 | 1939 | M04 | 52 | nan | S | All employees, 3-month average change, seasonally adjusted, thousands, total private, seasonally adjusted | Total private | Total private | ALL EMPLOYEES, 3-MONTH AVERAGE CHANGE, SEASONALLY ADJUSTED, THOUSANDS
26 | 5000000 | 5 | CES0500000026 | 1939 | M05 | 65 | nan | S | All employees, 3-month average change, seasonally adjusted, thousands, total private, seasonally adjusted | Total private | Total private | ALL EMPLOYEES, 3-MONTH AVERAGE CHANGE, SEASONALLY ADJUSTED, THOUSANDS
26 | 5000000 | 5 | CES0500000026 | 1939 | M06 | 74 | nan | S | All employees, 3-month average change, seasonally adjusted, thousands, total private, seasonally adjusted | Total private | Total private | ALL EMPLOYEES, 3-MONTH AVERAGE CHANGE, SEASONALLY ADJUSTED, THOUSANDS
26 | 5000000 | 5 | CES0500000026 | 1939 | M07 | 103 | nan | S | All employees, 3-month average change, seasonally adjusted, thousands, total private, seasonally adjusted | Total private | Total private | ALL EMPLOYEES, 3-MONTH AVERAGE CHANGE, SEASONALLY ADJUSTED, THOUSANDS
26 | 5000000 | 5 | CES0500000026 | 1939 | M08 | 108 | nan | S | All employees, 3-month average change, seasonally adjusted, thousands, total private, seasonally adjusted | Total private | Total private | ALL EMPLOYEES, 3-MONTH AVERAGE CHANGE, SEASONALLY ADJUSTED, THOUSANDS
26 | 5000000 | 5 | CES0500000026 | 1939 | M09 | 152 | nan | S | All employees, 3-month average change, seasonally adjusted, thousands, total private, seasonally adjusted | Total private | Total private | ALL EMPLOYEES, 3-MONTH AVERAGE CHANGE, SEASONALLY ADJUSTED, THOUSANDS
26 | 5000000 | 5 | CES0500000026 | 1939 | M10 | 307 | nan | S | All employees, 3-month average change, seasonally adjusted, thousands, total private, seasonally adjusted | Total private | Total private | ALL EMPLOYEES, 3-MONTH AVERAGE CHANGE, SEASONALLY ADJUSTED, THOUSANDS
26 | 5000000 | 5 | CES0500000026 | 1939 | M11 | 248 | nan | S | All employees, 3-month average change, seasonally adjusted, thousands, total private, seasonally adjusted | Total private | Total private | ALL EMPLOYEES, 3-MONTH AVERAGE CHANGE, SEASONALLY ADJUSTED, THOUSANDS
26 | 5000000 | 5 | CES0500000026 | 1939 | M12 | 151 | nan | S | All employees, 3-month average change, seasonally adjusted, thousands, total private, seasonally adjusted | Total private | Total private | ALL EMPLOYEES, 3-MONTH AVERAGE CHANGE, SEASONALLY ADJUSTED, THOUSANDS
26 | 5000000 | 5 | CES0500000026 | 1940 | M01 | 44 | nan | S | All employees, 3-month average change, seasonally adjusted, thousands, total private, seasonally adjusted | Total private | Total private | ALL EMPLOYEES, 3-MONTH AVERAGE CHANGE, SEASONALLY ADJUSTED, THOUSANDS
Name | Data type | Unique | Values (sample) | Description
data_type_code | string | 37 | 1; 10 | See https://download.bls.gov/pub/time.series/ce/ce.datatype
data_type_text | string | 37 | ALL EMPLOYEES, THOUSANDS; WOMEN EMPLOYEES, THOUSANDS | See https://download.bls.gov/pub/time.series/ce/ce.datatype
footnote_codes | string | 2 | nan; P |
industry_code | string | 902 | 30000000; 32000000 | Different industries covered. See https://download.bls.gov/pub/time.series/ce/ce.industry
industry_name | string | 895 | Durable goods; Nondurable goods | Different industries covered. See https://download.bls.gov/pub/time.series/ce/ce.industry
period | string | 13 | M03; M06 | See https://download.bls.gov/pub/time.series/ce/ce.period
seasonal | string | 2 | U; S |
series_id | string | 26,021 | CEU9091000001; CEU3000000034 | Different types of data series available in the dataset. See https://download.bls.gov/pub/time.series/ce/ce.series
series_title | string | 25,685 | All employees, thousands, durable goods, not seasonally adjusted; All employees, thousands, nondurable goods, not seasonally adjusted | Titles of the different types of data series available in the dataset. See https://download.bls.gov/pub/time.series/ce/ce.series
supersector_code | string | 22 | 31; 60 | Higher-level industry or sector classification. See https://download.bls.gov/pub/time.series/ce/ce.supersector
supersector_name | string | 22 | Durable Goods; Professional and business services | Higher-level industry or sector classification. See https://download.bls.gov/pub/time.series/ce/ce.supersector
value | float | 572,372 | 38.5; 38.400001525878906 |
year | int | 81 | 2017; 2013 |
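
The sample values in the schema suggest that series_id packs several of the other columns into one fixed-width string. For example, CES0500000026 appears alongside seasonal S, supersector_code 5, industry_code 5000000, and data_type_code 26. The following is a minimal sketch of splitting an ID under that assumption. The field widths are inferred from the samples on this page, not taken from BLS documentation, and the names `CesSeriesId` and `parse_series_id` are hypothetical helpers, not part of any package.

```python
from typing import NamedTuple


class CesSeriesId(NamedTuple):
    prefix: str            # survey prefix, "CE" in the samples
    seasonal: str          # "S" or "U", as in the seasonal column
    supersector_code: str  # two digits (assumed width)
    industry_code: str     # eight digits, starting with the supersector code
    data_type_code: str    # two digits (assumed width)


def parse_series_id(series_id: str) -> CesSeriesId:
    """Split a 13-character CES series ID into its assumed fixed-width fields."""
    series_id = series_id.strip()  # some displayed IDs carry trailing spaces
    if len(series_id) != 13:
        raise ValueError(
            f"expected 13 characters, got {len(series_id)}: {series_id!r}"
        )
    return CesSeriesId(
        prefix=series_id[0:2],
        seasonal=series_id[2],
        supersector_code=series_id[3:5],
        industry_code=series_id[3:11],
        data_type_code=series_id[11:13],
    )


print(parse_series_id("CES0500000026"))
```

The parsed fields keep their leading zeros, whereas the preview shows codes such as 5 and 5000000 without them, so compare against the dataset columns only after normalizing one side (for example with `str.zfill`).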

Azure Notebooks

Language: Python
In [1]:
# This is a package in preview.
from azureml.opendatasets import UsLaborEHENational

usLaborEHENational = UsLaborEHENational()
usLaborEHENational_df = usLaborEHENational.to_pandas_dataframe()
ActivityStarted, to_pandas_dataframe
ActivityStarted, to_pandas_dataframe_in_worker
Looking for parquet files...
Reading them into Pandas dataframe...
Reading ehe_national/part-00000-tid-148006372733218319-122ceb1f-08a6-4430-acc4-afa8feb00295-6944-1-c000.snappy.parquet under container laborstatisticscontainer
Done.
ActivityCompleted: Activity=to_pandas_dataframe_in_worker, HowEnded=Success, Duration=35903.59 [ms]
ActivityCompleted: Activity=to_pandas_dataframe, HowEnded=Success, Duration=35927.93 [ms]
In [2]:
usLaborEHENational_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7616400 entries, 0 to 7616399
Data columns (total 13 columns):
data_type_code      object
industry_code       object
supersector_code    object
series_id           object
year                int32
period              object
value               float32
footnote_codes      object
seasonal            object
series_title        object
supersector_name    object
industry_name       object
data_type_text      object
dtypes: float32(1), int32(1), object(11)
memory usage: 697.3+ MB
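
The year and period columns shown above are stored separately (an int32 and a string such as M04), so time-series work usually needs them combined into a date first. The sketch below is one way to do that with the standard library only. It assumes period codes have the form M01 to M13 and that M13 is a non-monthly annual average code; check ce.period to confirm this for your data. `period_to_date` is a hypothetical helper name.

```python
import datetime
from typing import Optional


def period_to_date(year: int, period: str) -> Optional[datetime.date]:
    """Return the first day of the month for M01..M12, or None otherwise."""
    if len(period) != 3 or period[0] != "M" or not period[1:].isdigit():
        raise ValueError(f"unexpected period code: {period!r}")
    month = int(period[1:])
    if 1 <= month <= 12:
        return datetime.date(int(year), month, 1)
    # Assumption: codes outside 1..12 (such as M13) are not calendar months.
    return None


print(period_to_date(1939, "M04"))
```

With pandas, the same function could be applied row by row to the year and period columns of the loaded dataframe, and rows that return None can be dropped before building a monthly index.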
In [1]:
# Pip install packages
import os, sys

!{sys.executable} -m pip install azure-storage-blob
!{sys.executable} -m pip install pyarrow
!{sys.executable} -m pip install pandas
In [2]:
# Azure storage access info
azure_storage_account_name = "azureopendatastorage"
azure_storage_sas_token = r""
container_name = "laborstatisticscontainer"
folder_name = "ehe_national/"
In [3]:
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient

if azure_storage_account_name is None or azure_storage_sas_token is None:
    raise Exception(
        "Provide your specific name and key for your Azure Storage account--see the Prerequisites section earlier.")

print('Looking for the first parquet under the folder ' +
      folder_name + ' in container "' + container_name + '"...')
container_url = f"https://{azure_storage_account_name}.blob.core.windows.net/"
blob_service_client = BlobServiceClient(
    container_url, azure_storage_sas_token if azure_storage_sas_token else None)

container_client = blob_service_client.get_container_client(container_name)
blobs = container_client.list_blobs(folder_name)
sorted_blobs = sorted(list(blobs), key=lambda e: e.name, reverse=True)
targetBlobName = ''
for blob in sorted_blobs:
    if blob.name.startswith(folder_name) and blob.name.endswith('.parquet'):
        targetBlobName = blob.name
        break

print('Target blob to download: ' + targetBlobName)
_, filename = os.path.split(targetBlobName)
blob_client = container_client.get_blob_client(targetBlobName)
with open(filename, 'wb') as local_file:
    # readinto() writes the downloaded blob into the open file object
    blob_client.download_blob().readinto(local_file)
In [4]:
# Read the parquet file into Pandas data frame
import pandas as pd

print('Reading the parquet file into Pandas data frame')
df = pd.read_parquet(filename)
In [5]:
# You can add your filter below
print('Loaded as a Pandas data frame: ')
df

Azure Databricks

Language: Python
In [1]:
# This is a package in preview.
from azureml.opendatasets import UsLaborEHENational

usLaborEHENational = UsLaborEHENational()
usLaborEHENational_df = usLaborEHENational.to_spark_dataframe()
Failure while loading azureml_run_type_providers. Failed to load entrypoint azureml.scriptrun = azureml.core.script_run:ScriptRun._from_run_dto with exception (six 1.10.0 (/usr/lib/python3/dist-packages), Requirement.parse('six>=1.11.0')).
ActivityStarted, to_spark_dataframe
ActivityStarted, to_spark_dataframe_in_worker
ActivityCompleted: Activity=to_spark_dataframe_in_worker, HowEnded=Success, Duration=4288.24 [ms]
ActivityCompleted: Activity=to_spark_dataframe, HowEnded=Success, Duration=4292.09 [ms]
In [2]:
display(usLaborEHENational_df.limit(5))
data_type_code | industry_code | supersector_code | series_id | year | period | value | footnote_codes | seasonal | series_title | supersector_name | industry_name | data_type_text
26 | 0 | 0 | CES0000000026 | 1939 | M04 | 57.0 | nan | S | All employees, 3-month average change, seasonally adjusted, thousands, total nonfarm, seasonally adjusted | Total nonfarm | Total nonfarm | ALL EMPLOYEES, 3-MONTH AVERAGE CHANGE, SEASONALLY ADJUSTED, THOUSANDS
26 | 0 | 0 | CES0000000026 | 1939 | M05 | 66.0 | nan | S | All employees, 3-month average change, seasonally adjusted, thousands, total nonfarm, seasonally adjusted | Total nonfarm | Total nonfarm | ALL EMPLOYEES, 3-MONTH AVERAGE CHANGE, SEASONALLY ADJUSTED, THOUSANDS
26 | 0 | 0 | CES0000000026 | 1939 | M06 | 74.0 | nan | S | All employees, 3-month average change, seasonally adjusted, thousands, total nonfarm, seasonally adjusted | Total nonfarm | Total nonfarm | ALL EMPLOYEES, 3-MONTH AVERAGE CHANGE, SEASONALLY ADJUSTED, THOUSANDS
26 | 0 | 0 | CES0000000026 | 1939 | M07 | 108.0 | nan | S | All employees, 3-month average change, seasonally adjusted, thousands, total nonfarm, seasonally adjusted | Total nonfarm | Total nonfarm | ALL EMPLOYEES, 3-MONTH AVERAGE CHANGE, SEASONALLY ADJUSTED, THOUSANDS
26 | 0 | 0 | CES0000000026 | 1939 | M08 | 121.0 | nan | S | All employees, 3-month average change, seasonally adjusted, thousands, total nonfarm, seasonally adjusted | Total nonfarm | Total nonfarm | ALL EMPLOYEES, 3-MONTH AVERAGE CHANGE, SEASONALLY ADJUSTED, THOUSANDS
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "laborstatisticscontainer"
blob_relative_path = "ehe_national/"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# Spark reads the parquet lazily; no data is loaded at this point
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))
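
The cells above build the wasbs path and the SAS configuration key with inline string formatting. As a small illustration of how the two strings are composed, the same logic can be wrapped in plain functions that run without a Spark session. This is only a sketch: `build_wasbs_path` and `build_sas_conf_key` are hypothetical helper names, and the string layouts are copied from the cells above.

```python
def build_wasbs_path(container: str, account: str, relative_path: str) -> str:
    """Compose the wasbs:// URL for a folder in a blob container."""
    return f"wasbs://{container}@{account}.blob.core.windows.net/{relative_path}"


def build_sas_conf_key(container: str, account: str) -> str:
    """Compose the Spark configuration key that holds the container's SAS token."""
    return f"fs.azure.sas.{container}.{account}.blob.core.windows.net"


wasbs_path = build_wasbs_path(
    "laborstatisticscontainer", "azureopendatastorage", "ehe_national/"
)
conf_key = build_sas_conf_key("laborstatisticscontainer", "azureopendatastorage")
print(wasbs_path)
print(conf_key)
```

In a notebook with an active Spark session, the results would be passed to `spark.conf.set(conf_key, blob_sas_token)` and `spark.read.parquet(wasbs_path)` exactly as in the cells above.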

Azure Synapse

Language: Python
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "laborstatisticscontainer"
blob_relative_path = "ehe_national/"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# Spark reads the parquet lazily; no data is loaded at this point
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))