Public Holidays

Worldwide public holiday data generated from the PyPI holidays package and Wikipedia, covering 38 countries or regions from 1970 to 2099.

Each row contains the holiday information for a specific date and country or region, and indicates whether most people have paid time off on that date.
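As a rough illustration of the row layout described above, the sketch below models a row with a small dataclass and builds a lookup keyed by country and date. The `HolidayRow` class and the `is_holiday` helper are hypothetical names introduced for this example and are not part of any Azure package; the two sample rows are copied from the preview table.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class HolidayRow:
    """Hypothetical in-memory model of one dataset row."""
    country_or_region: str
    holiday_name: str
    day: date
    is_paid_time_off: Optional[bool] = None  # None mirrors NULL (unknown)


# Sample rows taken from the preview table.
rows = [
    HolidayRow("Australia", "Boxing Day", date(2098, 12, 26)),
    HolidayRow("Norway", "Søndag", date(2098, 12, 28)),
]

# Index rows by (country, date) so a lookup does not scan the whole list.
index = {(r.country_or_region, r.day): r for r in rows}


def is_holiday(country: str, day: date) -> bool:
    """Return True if the sample index holds a holiday for this country and date."""
    return (country, day) in index


print(is_holiday("Australia", date(2098, 12, 26)))  # True
print(is_holiday("Australia", date(2098, 12, 28)))  # False
```

Keying on the pair of country and date reflects that the same date can appear once per country or region.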

Volume and retention

This dataset is stored in Parquet format. It is a snapshot with holiday information from 1970-01-01 to 2099-01-01. The data size is about 500 KB.

Storage location

This dataset is stored in the East US Azure region. Allocating compute resources in East US is recommended for affinity.

Additional information

This dataset combines data sourced from Wikipedia (WikiMedia Foundation Inc) and the PyPI holidays package.

The combined dataset is provided under the Creative Commons Attribution-ShareAlike 3.0 Unported License.

Send an email if you have any questions about the data source.

Notices

MICROSOFT PROVIDES AZURE OPEN DATASETS ON AN “AS IS” BASIS. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, GUARANTEES OR CONDITIONS WITH RESPECT TO YOUR USE OF THE DATASETS. TO THE MAXIMUM EXTENT PERMITTED UNDER YOUR LOCAL LAW, MICROSOFT DISCLAIMS ALL LIABILITY FOR ANY DAMAGES OR COMMERCIAL LOSSES, INCLUDING DIRECT, CONSEQUENTIAL, SPECIAL, INDIRECT, INCIDENTAL OR PUNITIVE DAMAGES, RESULTING FROM YOUR USE OF THE DATASETS.

This dataset is provided under the original terms under which Microsoft received the source data. The dataset may include data sourced from Microsoft.

Access

Available in | When to use
Azure Notebooks | Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine.
Azure Databricks | Use this when you need the scale of an Azure managed Spark cluster to process the dataset.
Azure Synapse | Use this when you need the scale of an Azure managed Spark cluster to process the dataset.

Preview

countryOrRegion | holidayName | normalizeHolidayName | countryRegionCode | date
Norway | Søndag | Søndag | NO | 12/28/2098 12:00:00 AM
Sweden | Söndag | Söndag | SE | 12/28/2098 12:00:00 AM
Australia | Boxing Day | Boxing Day | AU | 12/26/2098 12:00:00 AM
Hungary | Karácsony másnapja | Karácsony másnapja | HU | 12/26/2098 12:00:00 AM
Austria | Stefanitag | Stefanitag | AT | 12/26/2098 12:00:00 AM
Canada | Boxing Day | Boxing Day | CA | 12/26/2098 12:00:00 AM
Croatia | Sveti Stjepan | Sveti Stjepan | HR | 12/26/2098 12:00:00 AM
Czech | 2. svátek vánoční | 2. svátek vánoční | CZ | 12/26/2098 12:00:00 AM
Denmark | Anden juledag | Anden juledag | DK | 12/26/2098 12:00:00 AM
England | Boxing Day | Boxing Day | null | 12/26/2098 12:00:00 AM
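The preview renders the `date` column in a US-style 12-hour format, while the schema lists the column as a timestamp. A minimal sketch of parsing such a string with the standard library follows. The `parse_preview_date` helper is a hypothetical name, and the format string is an assumption inferred from the preview rows rather than a documented format.

```python
from datetime import datetime

# Assumed layout of the preview strings, e.g. "12/28/2098 12:00:00 AM".
PREVIEW_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_preview_date(text: str) -> datetime:
    """Parse one preview-style date string into a datetime."""
    return datetime.strptime(text, PREVIEW_FORMAT)


parsed = parse_preview_date("12/28/2098 12:00:00 AM")
print(parsed.isoformat())  # 2098-12-28T00:00:00
```

When the data is loaded from the Parquet files the column already arrives as a timestamp, so parsing like this would only matter for text copied from the preview.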
Name | Data type | Unique | Values (sample) | Description
countryOrRegion | string | 38 | Sweden, Norway | Full name of the country or region.
countryRegionCode | string | 35 | SE, NO | Country or region code, for example SE or NO.
date | timestamp | 20,665 | 2037-01-01 00:00:00, 2032-01-01 00:00:00 | Date of the holiday.
holidayName | string | 483 | Søndag, Söndag | Full name of the holiday.
isPaidTimeOff | boolean | 3 | True | Indicates whether most people have paid time off on this date. Currently available only for the US, the UK, and India. If the value is NULL, the information is unavailable.
normalizeHolidayName | string | 438 | Søndag, Söndag | Normalized name of the holiday.
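Because `isPaidTimeOff` can be true, false, or NULL, a plain truthiness check would lump "unknown" together with "not paid". The sketch below keeps the three states apart. The `describe_paid_time_off` helper is a hypothetical name, and the sample values are made up for illustration rather than taken from the dataset.

```python
from collections import Counter
from typing import Optional


def describe_paid_time_off(value: Optional[bool]) -> str:
    """Map the three possible states to labels; None stands in for NULL."""
    if value is None:
        return "unknown"
    return "paid" if value else "unpaid"


# Made-up values for illustration only.
sample_values = [True, None, None, False, None]
counts = Counter(describe_paid_time_off(v) for v in sample_values)
print(dict(counts))  # {'paid': 1, 'unknown': 3, 'unpaid': 1}
```

Checking `is None` first matters because both `False` and `None` are falsy in Python.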

Azure Notebooks

In [1]:
# This is a package in preview.
from datetime import datetime

from azureml.opendatasets import PublicHolidays
from dateutil.relativedelta import relativedelta

# Load the holidays that fall within the last month.
end_date = datetime.today()
start_date = end_date - relativedelta(months=1)
hol = PublicHolidays(start_date=start_date, end_date=end_date)
hol_df = hol.to_pandas_dataframe()
ActivityStarted, to_pandas_dataframe
ActivityStarted, to_pandas_dataframe_in_worker
Looking for parquet files...
Reading them into Pandas dataframe...
Reading Processed/part-00000-tid-8575944798531137721-7b2fbd47-2ae5-45fd-b8b5-daa663d33177-649-c000.snappy.parquet under container holidaydatacontainer
Done.
ActivityCompleted: Activity=to_pandas_dataframe_in_worker, HowEnded=Success, Duration=955.3 [ms]
ActivityCompleted: Activity=to_pandas_dataframe, HowEnded=Success, Duration=958.23 [ms]
In [2]:
hol_df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 34 entries, 25706 to 25739
Data columns (total 6 columns):
countryOrRegion         34 non-null object
holidayName             34 non-null object
normalizeHolidayName    34 non-null object
isPaidTimeOff           1 non-null object
countryRegionCode       34 non-null object
date                    34 non-null datetime64[ns]
dtypes: datetime64[ns](1), object(5)
memory usage: 1.9+ KB
In [1]:
# Pip install packages
import os, sys

!{sys.executable} -m pip install azure-storage-blob
!{sys.executable} -m pip install pyarrow
!{sys.executable} -m pip install pandas
In [2]:
# Azure storage access info
azure_storage_account_name = "azureopendatastorage"
azure_storage_sas_token = r""
container_name = "holidaydatacontainer"
folder_name = "Processed"
In [3]:
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient

if azure_storage_account_name is None or azure_storage_sas_token is None:
    raise Exception(
        "Provide your specific name and key for your Azure Storage account--see the Prerequisites section earlier.")

print('Looking for the first parquet under the folder ' +
      folder_name + ' in container "' + container_name + '"...')
container_url = f"https://{azure_storage_account_name}.blob.core.windows.net/"
blob_service_client = BlobServiceClient(
    container_url, azure_storage_sas_token if azure_storage_sas_token else None)

container_client = blob_service_client.get_container_client(container_name)
blobs = container_client.list_blobs(folder_name)
sorted_blobs = sorted(list(blobs), key=lambda e: e.name, reverse=True)
targetBlobName = ''
for blob in sorted_blobs:
    if blob.name.startswith(folder_name) and blob.name.endswith('.parquet'):
        targetBlobName = blob.name
        break

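if not targetBlobName:
    raise FileNotFoundError('No parquet file found under ' + folder_name)

print('Target blob to download: ' + targetBlobName)
_, filename = os.path.split(targetBlobName)
blob_client = container_client.get_blob_client(targetBlobName)
with open(filename, 'wb') as local_file:
    # readinto() streams the blob contents into the open file handle
    blob_client.download_blob().readinto(local_file)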
In [4]:
# Read the parquet file into Pandas data frame
import pandas as pd

print('Reading the parquet file into Pandas data frame')
df = pd.read_parquet(filename)
In [5]:
# Add your own filters below
print('Loaded as a Pandas data frame: ')
df
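One possible filter, sketched here on a tiny hand-built frame so that it runs without downloading anything, keeps a single country and a date window. The `filter_holidays` helper is a hypothetical name, the column names follow the schema table, and the sample rows are copied from the preview.

```python
import pandas as pd


def filter_holidays(df, country, start, end):
    """Return rows for one country with start <= date <= end (inclusive)."""
    mask = (
        (df["countryOrRegion"] == country)
        & (df["date"] >= pd.Timestamp(start))
        & (df["date"] <= pd.Timestamp(end))
    )
    return df.loc[mask]


# Stand-in for the frame loaded from the parquet file.
sample = pd.DataFrame(
    {
        "countryOrRegion": ["Norway", "Sweden", "Canada"],
        "holidayName": ["Søndag", "Söndag", "Boxing Day"],
        "date": pd.to_datetime(["2098-12-28", "2098-12-28", "2098-12-26"]),
    }
)

result = filter_holidays(sample, "Canada", "2098-12-01", "2098-12-31")
print(result["holidayName"].tolist())  # ['Boxing Day']
```

The same call should work on the downloaded `df`, assuming its `date` column is a datetime type as the schema indicates.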

Azure Databricks

In [1]:
# This is a package in preview.
# You need to pip install azureml-opendatasets in Databricks cluster. https://docs.microsoft.com/en-us/azure/data-explorer/connect-from-databricks#install-the-python-library-on-your-azure-databricks-cluster
from datetime import datetime

from azureml.opendatasets import PublicHolidays
from dateutil.relativedelta import relativedelta

# Load the holidays that fall within the last month.
end_date = datetime.today()
start_date = end_date - relativedelta(months=1)
hol = PublicHolidays(start_date=start_date, end_date=end_date)
hol_df = hol.to_spark_dataframe()
ActivityStarted, to_spark_dataframe
ActivityStarted, to_spark_dataframe_in_worker
ActivityCompleted: Activity=to_spark_dataframe_in_worker, HowEnded=Success, Duration=2221.62 [ms]
ActivityCompleted: Activity=to_spark_dataframe, HowEnded=Success, Duration=2223.36 [ms]
In [2]:
display(hol_df.limit(5))
countryOrRegion | holidayName | normalizeHolidayName | isPaidTimeOff | countryRegionCode | date
Norway | Søndag | Søndag | null | NO | 2019-06-16T00:00:00.000+0000
South Africa | Youth Day | Youth Day | null | ZA | 2019-06-16T00:00:00.000+0000
Sweden | Söndag | Söndag | null | SE | 2019-06-16T00:00:00.000+0000
Ukraine | Трійця | Трійця | null | UA | 2019-06-16T00:00:00.000+0000
Argentina | Día Pase a la Inmortalidad del General Martín Miguel de Güemes [Day Pass to the Immortality of General Martín Miguel de Güemes] | Día Pase a la Inmortalidad del General Martín Miguel de Güemes [Day Pass to the Immortality of General Martín Miguel de Güemes] | null | AR | 2019-06-17T00:00:00.000+0000
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "holidaydatacontainer"
blob_relative_path = "Processed"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# Spark reads the parquet files lazily, so no data is loaded at this point
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))
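The path and the configuration key built with `%` formatting in this notebook follow a fixed pattern. As a sketch, the same strings can be produced by two small helpers, which makes the pattern easy to check outside a cluster. `build_wasbs_path` and `sas_config_key` are hypothetical names introduced here; only the string layout comes from the notebook cells.

```python
def build_wasbs_path(container: str, account: str, relative_path: str) -> str:
    """Build the wasbs:// URL for a folder inside a blob container."""
    return f"wasbs://{container}@{account}.blob.core.windows.net/{relative_path}"


def sas_config_key(container: str, account: str) -> str:
    """Build the Spark configuration key that holds the container SAS token."""
    return f"fs.azure.sas.{container}.{account}.blob.core.windows.net"


path = build_wasbs_path("holidaydatacontainer", "azureopendatastorage", "Processed")
key = sas_config_key("holidaydatacontainer", "azureopendatastorage")
print(path)
print(key)
```

In a notebook, `key` would be passed to `spark.conf.set` together with the SAS token, and `path` to `spark.read.parquet`, as the cells above do.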

Azure Synapse

In [33]:
# This is a package in preview.
from datetime import datetime

from azureml.opendatasets import PublicHolidays
from dateutil.relativedelta import relativedelta

# Load the holidays that fall within the last month.
end_date = datetime.today()
start_date = end_date - relativedelta(months=1)
hol = PublicHolidays(start_date=start_date, end_date=end_date)
hol_df = hol.to_spark_dataframe()
In [34]:
# Display top 5 rows
display(hol_df.limit(5))
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "holidaydatacontainer"
blob_relative_path = "Processed"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# Spark reads the parquet files lazily, so no data is loaded at this point
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))