
US State Employment Hours and Earnings

labor statistics employment hours earnings state

The Current Employment Statistics (CES) program produces detailed industry estimates of employment, hours, and earnings of workers on payrolls in the United States, excluding farm workers, employees of nonprofit organizations, and private household workers.

This dataset is sourced from the State and Metro Area Employment, Hours, and Earnings data published by the US Bureau of Labor Statistics (BLS). Review Linking and Copyright Information and Important Web Site Notices for the terms and conditions related to the use of this dataset.

Storage Location

This dataset is stored in the East US Azure region. Allocating compute resources in East US is recommended for affinity.

Notices

MICROSOFT PROVIDES AZURE OPEN DATASETS ON AN "AS IS" BASIS. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, GUARANTEES OR CONDITIONS WITH RESPECT TO YOUR USE OF THE DATASETS. TO THE EXTENT PERMITTED UNDER YOUR LOCAL LAW, MICROSOFT DISCLAIMS ALL LIABILITY FOR ANY DAMAGES OR LOSSES, INCLUDING DIRECT, CONSEQUENTIAL, SPECIAL, INDIRECT, INCIDENTAL OR PUNITIVE, RESULTING FROM YOUR USE OF THE DATASETS.

This dataset is provided under the original terms under which Microsoft received the source data. The dataset may include data sourced from Microsoft.

Access

Available in | When to use
Azure Notebooks | Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine.
Azure Databricks | Use this when you need the scale of an Azure managed Spark cluster to process the dataset.
Azure Synapse | Use this when you need the scale of an Azure managed Spark cluster to process the dataset.

Preview

area_code | state_code | data_type_code | industry_code | supersector_code | series_id | year | period | value | footnote_codes | seasonal | supersector_name | industry_name | data_type_text | state_name | area_name
13460 | 41 | 26 | 0 | 0 | SMS41134600000000026 | 1990 | M04 | 0.2 | nan | S | Total Nonfarm | Total Nonfarm | All Employees, 3-month average change, In Thousands, seasonally adjusted | Oregon | Bend-Redmond, OR
13460 | 41 | 26 | 0 | 0 | SMS41134600000000026 | 1990 | M05 | 0.2 | nan | S | Total Nonfarm | Total Nonfarm | All Employees, 3-month average change, In Thousands, seasonally adjusted | Oregon | Bend-Redmond, OR
13460 | 41 | 26 | 0 | 0 | SMS41134600000000026 | 1990 | M06 | 0.1 | nan | S | Total Nonfarm | Total Nonfarm | All Employees, 3-month average change, In Thousands, seasonally adjusted | Oregon | Bend-Redmond, OR
13460 | 41 | 26 | 0 | 0 | SMS41134600000000026 | 1990 | M07 | 0.1 | nan | S | Total Nonfarm | Total Nonfarm | All Employees, 3-month average change, In Thousands, seasonally adjusted | Oregon | Bend-Redmond, OR
13460 | 41 | 26 | 0 | 0 | SMS41134600000000026 | 1990 | M08 | 0.2 | nan | S | Total Nonfarm | Total Nonfarm | All Employees, 3-month average change, In Thousands, seasonally adjusted | Oregon | Bend-Redmond, OR
13460 | 41 | 26 | 0 | 0 | SMS41134600000000026 | 1990 | M09 | 0.2 | nan | S | Total Nonfarm | Total Nonfarm | All Employees, 3-month average change, In Thousands, seasonally adjusted | Oregon | Bend-Redmond, OR
13460 | 41 | 26 | 0 | 0 | SMS41134600000000026 | 1990 | M10 | 0.1 | nan | S | Total Nonfarm | Total Nonfarm | All Employees, 3-month average change, In Thousands, seasonally adjusted | Oregon | Bend-Redmond, OR
13460 | 41 | 26 | 0 | 0 | SMS41134600000000026 | 1990 | M11 | 0.1 | nan | S | Total Nonfarm | Total Nonfarm | All Employees, 3-month average change, In Thousands, seasonally adjusted | Oregon | Bend-Redmond, OR
13460 | 41 | 26 | 0 | 0 | SMS41134600000000026 | 1990 | M12 | 0.2 | nan | S | Total Nonfarm | Total Nonfarm | All Employees, 3-month average change, In Thousands, seasonally adjusted | Oregon | Bend-Redmond, OR
13460 | 41 | 26 | 0 | 0 | SMS41134600000000026 | 1991 | M01 | 0.1 | nan | S | Total Nonfarm | Total Nonfarm | All Employees, 3-month average change, In Thousands, seasonally adjusted | Oregon | Bend-Redmond, OR
Name | Data type | Unique | Values (sample)
area_code | string | 446 | 0; 31084
area_name | string | 446 | Statewide; Los Angeles-Long Beach-Glendale, CA Metropolitan Division
data_type_code | string | 9 | 1; 2
data_type_text | string | 9 | All Employees, In Thousands; Average Weekly Earnings of All Employees, In Dollars
footnote_codes | string | 3 | nan; P
industry_code | string | 343 | 0; 5000000
industry_name | string | 343 | Total Nonfarm; Total Private
period | string | 13 | M04; M05
seasonal | string | 2 | U; S
series_id | string | 23,853 | SMU36000000000000001; SMU41000000000000001
state_code | string | 53 | 6; 48
state_name | string | 53 | California; Texas
supersector_code | string | 22 | 90; 60
supersector_name | string | 22 | Government; Professional and Business Services
value | float | 132,565 | 0.30000001192092896; 0.10000000149011612
year | int | 81 | 2014; 2018
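
The series_id and period columns pack several fields into one string. The sketch below shows one way to unpack them. The fixed-width layout is an assumption inferred from the preview rows, where the state, area, and data type codes reappear inside series_id; it is not taken from an official specification quoted on this page. The helper names parse_series_id and period_to_date are hypothetical. The period column has 13 distinct codes, so the sketch treats anything outside M01 to M12 as a non-monthly value (assumed here to be an annual figure) and returns None for it.

```python
from datetime import date
from typing import Dict, Optional

# Assumed layout, inferred from the preview rows:
# prefix(2) + seasonal(1) + state(2) + area(5) + industry(8) + data type(2)
_FIELDS = (
    ("prefix", 2),
    ("seasonal", 1),
    ("state_code", 2),
    ("area_code", 5),
    ("industry_code", 8),
    ("data_type_code", 2),
)


def parse_series_id(series_id: str) -> Dict[str, str]:
    """Split a series_id into its assumed fixed-width fields."""
    sid = series_id.strip()  # tolerate padding around the identifier
    expected = sum(width for _, width in _FIELDS)
    if len(sid) != expected:
        raise ValueError(f"expected {expected} characters, got {len(sid)}: {sid!r}")
    parts = {}
    pos = 0
    for name, width in _FIELDS:
        parts[name] = sid[pos:pos + width]
        pos += width
    return parts


def period_to_date(year: int, period: str) -> Optional[date]:
    """Map a year and a code 'M01'..'M12' to the first day of that month.

    Any other code (for example 'M13') yields None.
    """
    if len(period) == 3 and period[0] == "M" and period[1:].isdigit():
        month = int(period[1:])
        if 1 <= month <= 12:
            return date(year, month, 1)
    return None


# Values taken from the first preview row.
print(parse_series_id("SMS41134600000000026"))
print(period_to_date(1990, "M04"))
```

Note that the codes inside series_id are zero-padded, while the standalone code columns in the preview are not (for example industry_code 0), so compare them as integers rather than as raw strings.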


Azure Notebooks

Language: Python
In [2]:
# This is a package in preview.
from azureml.opendatasets import UsLaborEHEState

usLaborEHEState = UsLaborEHEState()
usLaborEHEState_df = usLaborEHEState.to_pandas_dataframe()
ActivityStarted, to_pandas_dataframe
ActivityStarted, to_pandas_dataframe_in_worker
Looking for parquet files...
Reading them into Pandas dataframe...
Reading ehe_state/part-00000-tid-4584292525148687084-3a752bd7-c4a5-4be5-a63b-8293d5b83692-8930-1-c000.snappy.parquet under container laborstatisticscontainer
Done.
ActivityCompleted: Activity=to_pandas_dataframe_in_worker, HowEnded=Success, Duration=24771.26 [ms]
ActivityCompleted: Activity=to_pandas_dataframe, HowEnded=Success, Duration=24804.03 [ms]
In [3]:
usLaborEHEState_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8065684 entries, 0 to 8065683
Data columns (total 16 columns):
area_code           object
state_code          object
data_type_code      object
industry_code       object
supersector_code    object
series_id           object
year                int32
period              object
value               float32
footnote_codes      object
seasonal            object
supersector_name    object
industry_name       object
data_type_text      object
state_name          object
area_name           object
dtypes: float32(1), int32(1), object(14)
memory usage: 923.0+ MB
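
The info() output above shows 14 object columns and more than 923 MB of memory, while the schema lists only a handful of distinct values for most of those columns. One possible way to reduce the footprint is to convert repetitive text columns to the pandas category dtype. The following is a minimal sketch, not part of the original notebook; the helper name categorize_repetitive_columns, the 0.5 threshold, and the miniature frame are all assumptions made for illustration.

```python
import pandas as pd


def categorize_repetitive_columns(df: pd.DataFrame, max_unique_ratio: float = 0.5) -> pd.DataFrame:
    """Return a copy in which repetitive text columns use the category dtype."""
    out = df.copy()
    for col in out.columns:
        series = out[col]
        is_text = pd.api.types.is_object_dtype(series) or pd.api.types.is_string_dtype(series)
        # Convert only when values repeat often enough to make categories worthwhile.
        if is_text and series.nunique(dropna=False) <= max_unique_ratio * len(series):
            out[col] = series.astype("category")
    return out


# Hypothetical miniature of the real frame, for illustration only.
small = pd.DataFrame({
    "state_name": ["Oregon", "Oregon", "Texas", "Texas", "Oregon", "Texas"],
    "seasonal": ["S", "S", "U", "U", "S", "U"],
    "value": [0.2, 0.2, 0.1, 0.1, 0.2, 0.2],
})
compact = categorize_repetitive_columns(small)
print(compact.dtypes)
```

On the full frame, the same call would be categorize_repetitive_columns(usLaborEHEState_df). Comparing memory_usage(deep=True).sum() before and after shows whether the conversion pays off.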
In [1]:
# Pip install packages
import os, sys

!{sys.executable} -m pip install azure-storage-blob
!{sys.executable} -m pip install pyarrow
!{sys.executable} -m pip install pandas
In [2]:
# Azure storage access info
azure_storage_account_name = "azureopendatastorage"
azure_storage_sas_token = r""
container_name = "laborstatisticscontainer"
folder_name = "ehe_state/"
In [3]:
# BlobServiceClient is the entry point of the current (v12) azure-storage-blob SDK;
# the legacy BlockBlobService class is not available there.
from azure.storage.blob import BlobServiceClient

if azure_storage_account_name is None or azure_storage_sas_token is None:
    raise Exception(
        "Provide your specific name and key for your Azure Storage account--see the Prerequisites section earlier.")

print('Looking for the first parquet under the folder ' +
      folder_name + ' in container "' + container_name + '"...')
container_url = f"https://{azure_storage_account_name}.blob.core.windows.net/"
blob_service_client = BlobServiceClient(
    container_url, azure_storage_sas_token if azure_storage_sas_token else None)

container_client = blob_service_client.get_container_client(container_name)
blobs = container_client.list_blobs(folder_name)
sorted_blobs = sorted(list(blobs), key=lambda e: e.name, reverse=True)
targetBlobName = ''
for blob in sorted_blobs:
    if blob.name.startswith(folder_name) and blob.name.endswith('.parquet'):
        targetBlobName = blob.name
        break

print('Target blob to download: ' + targetBlobName)
_, filename = os.path.split(targetBlobName)
blob_client = container_client.get_blob_client(targetBlobName)
with open(filename, 'wb') as local_file:
    # readinto() streams the blob into the open file (download_to_stream is deprecated)
    blob_client.download_blob().readinto(local_file)
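
The selection loop in the cell above takes the first name, in reverse alphabetical order, that sits under the folder and ends in .parquet. If nothing matches, targetBlobName stays empty and the download fails with a less obvious error. The sketch below isolates that selection step as a plain function so it can be checked without a storage connection. The function name pick_latest_parquet and the sample blob names are hypothetical.

```python
from typing import Iterable, Optional


def pick_latest_parquet(blob_names: Iterable[str], folder: str) -> Optional[str]:
    """Return the last .parquet blob name under `folder` in alphabetical order.

    This mirrors sorting the names in reverse and taking the first match.
    Returns None when no blob qualifies.
    """
    candidates = [
        name for name in blob_names
        if name.startswith(folder) and name.endswith(".parquet")
    ]
    return max(candidates) if candidates else None


# Hypothetical listing, for illustration only.
names = [
    "ehe_state/_SUCCESS",
    "ehe_state/part-00000-example.snappy.parquet",
    "ehe_state/part-00001-example.snappy.parquet",
    "other_folder/part-00009-example.snappy.parquet",
]
print(pick_latest_parquet(names, "ehe_state/"))
```

With a real container client, the input could be (blob.name for blob in container_client.list_blobs(folder_name)), and a None result can be turned into an explicit error before attempting the download.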
In [4]:
# Read the parquet file into Pandas data frame
import pandas as pd

print('Reading the parquet file into Pandas data frame')
df = pd.read_parquet(filename)
In [5]:
# You can add your own filters below
print('Loaded as a Pandas data frame: ')
df
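
As an illustration of the kind of filter that could go in the cell above, the sketch below keeps the monthly rows for one state and derives a proper date column from year and period. It runs on a small hand-made frame rather than the downloaded data. The function name monthly_rows, the Texas row, and the M13 row are hypothetical. Treating M13 as a non-monthly code is an assumption based on period having 13 distinct values.

```python
import pandas as pd


def monthly_rows(df: pd.DataFrame, state_name: str) -> pd.DataFrame:
    """Keep one state's monthly observations and add a `date` column."""
    mask = (df["state_name"] == state_name) & (df["period"] != "M13")
    out = df.loc[mask].copy()
    # 'M04' -> '04'; combined with the year this gives e.g. '199004'
    out["date"] = pd.to_datetime(
        out["year"].astype(str) + out["period"].str[1:], format="%Y%m"
    )
    return out.sort_values("date").reset_index(drop=True)


# Hypothetical sample shaped like the dataset, for illustration only.
sample = pd.DataFrame({
    "state_name": ["Oregon", "Oregon", "Oregon", "Texas"],
    "year": [1990, 1990, 1990, 1990],
    "period": ["M05", "M04", "M13", "M04"],
    "value": [0.2, 0.2, 0.3, 1.5],
})
result = monthly_rows(sample, "Oregon")
print(result[["state_name", "date", "value"]])
```

On the downloaded data the call would be monthly_rows(df, "Oregon"). Further conditions on data_type_code or seasonal can be added to the mask in the same way.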

Azure Databricks

Language: Python
In [1]:
# This is a package in preview.
from azureml.opendatasets import UsLaborEHEState

usLaborEHEState = UsLaborEHEState()
usLaborEHEState_df = usLaborEHEState.to_spark_dataframe()
Failure while loading azureml_run_type_providers. Failed to load entrypoint azureml.scriptrun = azureml.core.script_run:ScriptRun._from_run_dto with exception (six 1.10.0 (/usr/lib/python3/dist-packages), Requirement.parse('six>=1.11.0')).
ActivityStarted, to_spark_dataframe
ActivityStarted, to_spark_dataframe_in_worker
ActivityCompleted: Activity=to_spark_dataframe_in_worker, HowEnded=Success, Duration=2921.23 [ms]
ActivityCompleted: Activity=to_spark_dataframe, HowEnded=Success, Duration=2924.96 [ms]
In [2]:
display(usLaborEHEState_df.limit(5))
area_code | state_code | data_type_code | industry_code | supersector_code | series_id | year | period | value | footnote_codes | seasonal | supersector_name | industry_name | data_type_text | state_name | area_name
13460 | 41 | 26 | 0 | 0 | SMS41134600000000026 | 1990 | M04 | 0.2 | nan | S | Total Nonfarm | Total Nonfarm | All Employees, 3-month average change, In Thousands, seasonally adjusted | Oregon | Bend-Redmond, OR
13460 | 41 | 26 | 0 | 0 | SMS41134600000000026 | 1990 | M05 | 0.2 | nan | S | Total Nonfarm | Total Nonfarm | All Employees, 3-month average change, In Thousands, seasonally adjusted | Oregon | Bend-Redmond, OR
13460 | 41 | 26 | 0 | 0 | SMS41134600000000026 | 1990 | M06 | 0.1 | nan | S | Total Nonfarm | Total Nonfarm | All Employees, 3-month average change, In Thousands, seasonally adjusted | Oregon | Bend-Redmond, OR
13460 | 41 | 26 | 0 | 0 | SMS41134600000000026 | 1990 | M07 | 0.1 | nan | S | Total Nonfarm | Total Nonfarm | All Employees, 3-month average change, In Thousands, seasonally adjusted | Oregon | Bend-Redmond, OR
13460 | 41 | 26 | 0 | 0 | SMS41134600000000026 | 1990 | M08 | 0.2 | nan | S | Total Nonfarm | Total Nonfarm | All Employees, 3-month average change, In Thousands, seasonally adjusted | Oregon | Bend-Redmond, OR
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "laborstatisticscontainer"
blob_relative_path = "ehe_state/"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
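
The cell above builds two strings by hand: the wasbs:// path that Spark reads from, and the configuration key under which the SAS token is registered. The sketch below wraps both in small functions so they can be inspected outside a Spark session. The function names build_wasbs_path and sas_conf_key are hypothetical; the string formats are copied from the cell above.

```python
def build_wasbs_path(container: str, account: str, relative_path: str) -> str:
    """Return the wasbs:// URL for a path inside a blob container."""
    return f"wasbs://{container}@{account}.blob.core.windows.net/{relative_path}"


def sas_conf_key(container: str, account: str) -> str:
    """Return the Spark configuration key that holds the container's SAS token."""
    return f"fs.azure.sas.{container}.{account}.blob.core.windows.net"


# Values taken from the access-info cell.
path = build_wasbs_path("laborstatisticscontainer", "azureopendatastorage", "ehe_state/")
key = sas_conf_key("laborstatisticscontainer", "azureopendatastorage")
print(path)
print(key)
```

Inside a notebook with an active session, these would be used as spark.conf.set(key, blob_sas_token) followed by spark.read.parquet(path).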
In [3]:
# Read the parquet files with Spark; evaluation is lazy, so no data is loaded yet
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))

Azure Synapse

Language: Python
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "laborstatisticscontainer"
blob_relative_path = "ehe_state/"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# Read the parquet files with Spark; evaluation is lazy, so no data is loaded yet
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))