
Boston Safety Data

Boston 311 CRM Case Management City Services Public Safety

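311 calls reported to the City of Boston.

For more information about BOS:311, see the linked page.

Volume and retention

This dataset is stored in Parquet format. It is updated daily and, as of 2019, contains about 100K rows (10 MB) in total.

This dataset contains historical records accumulated from 2011 to the present. You can use parameter settings in the SDK to fetch data within a specific time range.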
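
The time range is expressed as ordinary `datetime` values. The following is a minimal sketch, using only the standard library, of building a one-month window. The helper name `month_window` is hypothetical and not part of the SDK; the commented-out `BostonSafety(start_date=..., end_date=...)` call mirrors the usage shown in the notebooks on this page. Whether the SDK treats `end_date` as inclusive is not stated here, so check the result if the boundary matters.

```python
from datetime import datetime
from typing import Tuple


def month_window(year: int, month: int) -> Tuple[datetime, datetime]:
    """Return (start, end) for one calendar month; end is the first day of the next month."""
    start = datetime(year, month, 1)
    # After December, roll over to January of the following year.
    end = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
    return start, end


start_date, end_date = month_window(2015, 5)
print(start_date, end_date)

# With the azureml-opendatasets package installed, the window could be passed on:
# from azureml.opendatasets import BostonSafety
# safety = BostonSafety(start_date=start_date, end_date=end_date).to_pandas_dataframe()
```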

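Storage location

This dataset is stored in the East US Azure region. Allocating compute resources in East US is recommended for affinity.

Additional information

This dataset is sourced from the City of Boston government. More details can be found on the linked page. For the license to use this dataset, see the Open Data Commons Public Domain Dedication and License (ODC PDDL).

Notices

Microsoft provides Azure Open Datasets on an "as is" basis. Microsoft makes no warranties, express or implied, guarantees or conditions with respect to your use of the datasets. To the extent permitted under your local law, Microsoft disclaims all liability for any damages or losses, including direct, consequential, special, indirect, incidental or punitive, resulting from your use of the datasets.

This dataset is provided under the original terms under which Microsoft received the source data. The dataset may include data sourced from Microsoft.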

Access

Available in | When to use
Azure Notebooks | Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine.
Azure Databricks | Use this when you need the scale of an Azure managed Spark cluster to process the dataset.
Azure Synapse | Use this when you need the scale of an Azure managed Spark cluster to process the dataset.

Preview

dataType | dataSubtype | dateTime | category | subcategory | status | address | latitude | longitude | source | extendedProperties
Safety | 311_All | 6/8/2021 11:57:00 PM | Highway Maintenance | Contractor Complaints | Open | 35 South St Jamaica Plain MA 02130 | 42.3086 | -71.1158 | Constituent Call |
Safety | 311_All | 6/8/2021 11:57:00 PM | Building | Exceeding Terms of Permit | Open | INTERSECTION of Bardwell St & Sedgwick St Jamaica Plain MA | 42.3594 | -71.0587 | Constituent Call |
Safety | 311_All | 6/8/2021 11:53:38 PM | Environmental Services | Rodent Activity | Open | 280 North St Boston MA 02113 | 42.3639 | -71.0521 | Citizens Connect App |
Safety | 311_All | 6/8/2021 11:43:10 PM | Enforcement & Abandoned Vehicles | Parking Enforcement | Closed | 190 Cornell St Roslindale MA 02131 | 42.2789 | -71.1349 | Citizens Connect App |
Safety | 311_All | 6/8/2021 11:30:00 PM | Code Enforcement | Illegal Dumping | Open | INTERSECTION of Westcott St & Talbot Ave Dorchester MA | 42.3594 | -71.0587 | Constituent Call |
Safety | 311_All | 6/8/2021 11:28:00 PM | Signs & Signals | Traffic Signal Inspection | Open | INTERSECTION of Brighton Ave & Allston St Allston MA | 42.3594 | -71.0587 | Constituent Call |
Safety | 311_All | 6/8/2021 11:20:25 PM | Highway Maintenance | Request for Pothole Repair | Open | 38 Waverly St Brighton MA 02135 | 42.3609 | -71.1433 | Citizens Connect App |
Safety | 311_All | 6/8/2021 11:16:39 PM | Enforcement & Abandoned Vehicles | Parking Enforcement | Closed | 190 Cornell St Roslindale MA 02131 | 42.2789 | -71.1349 | Citizens Connect App |
Safety | 311_All | 6/8/2021 11:14:20 PM | Highway Maintenance | Request for Pothole Repair | Open | 180 Train St Dorchester MA 02122 | 42.2893 | -71.0521 | Citizens Connect App |
Safety | 311_All | 6/8/2021 11:14:07 PM | Street Lights | Street Light Outages | Open | 1855 Washington St Roxbury MA 02118 | 42.3353 | -71.0789 | Citizens Connect App |
Name | Data type | Unique | Values (sample) | Description
address | string | 140,975 | " "; 1 City Hall Plz Boston MA 02108 | Location.
category | string | 54 | Street Cleaning; Sanitation | Reason for the service request.
dataSubtype | string | 1 | 311_All | "311_All"
dataType | string | 1 | Safety | "Safety"
dateTime | timestamp | 1,557,831 | 2015-07-23 10:51:00; 2015-07-23 10:47:00 | Date and time the service request was opened.
latitude | double | 1,622 | 42.3594; 42.3603 | The latitude value. Lines of latitude are parallel to the equator.
longitude | double | 1,807 | -71.0587; -71.0583 | The longitude value. Lines of longitude run perpendicular to lines of latitude, and all pass through both poles.
source | string | 7 | Constituent Call; Citizens Connect App | Original source of the case.
status | string | 2 | Closed; Open | Case status.
subcategory | string | 209 | Parking Enforcement; Requests for Street Cleaning | Type of the service request.
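
The dateTime, latitude, and longitude columns can be handled with the standard library alone. The sketch below parses the timestamp format shown in the preview rows and flags rows that carry the coordinates 42.3594, -71.0587, which every "INTERSECTION of ..." row in the preview shares. Treating that pair as a placeholder is an assumption drawn from the preview, not documented behavior, and both helper names are hypothetical.

```python
from datetime import datetime

# Coordinates shared by every "INTERSECTION of ..." row in the preview.
# Treating them as a placeholder is an assumption, not documented behavior.
ASSUMED_DEFAULT_COORDS = (42.3594, -71.0587)


def parse_preview_timestamp(text: str) -> datetime:
    """Parse the M/D/YYYY h:mm:ss AM/PM format used in the preview rows."""
    return datetime.strptime(text, "%m/%d/%Y %I:%M:%S %p")


def has_placeholder_coords(latitude: float, longitude: float) -> bool:
    """Return True when a row carries the assumed default coordinates."""
    return (round(latitude, 4), round(longitude, 4)) == ASSUMED_DEFAULT_COORDS


opened = parse_preview_timestamp("6/8/2021 11:57:00 PM")
print(opened.isoformat())
print(has_placeholder_coords(42.3594, -71.0587))
print(has_placeholder_coords(42.3086, -71.1158))
```

Rows flagged this way may be worth excluding from, or at least marking in, any map-based analysis.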

Azure Notebooks

In [1]:
# This is a package in preview.
from azureml.opendatasets import BostonSafety

from datetime import datetime
from dateutil import parser


end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = BostonSafety(start_date=start_date, end_date=end_date)
safety = safety.to_pandas_dataframe()
ActivityStarted, to_pandas_dataframe
ActivityStarted, to_pandas_dataframe_in_worker
Looking for parquet files...
Reading them into Pandas dataframe...
Reading Safety/Release/city=Boston/part-00196-tid-845600952581210110-a4f62588-4996-42d1-bc79-23a9b4635c63-447039.c000.snappy.parquet under container citydatacontainer
Done.
ActivityCompleted: Activity=to_pandas_dataframe_in_worker, HowEnded=Success, Duration=2213.69 [ms]
ActivityCompleted: Activity=to_pandas_dataframe, HowEnded=Success, Duration=2216.01 [ms]
In [2]:
safety.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1 entries, 56262 to 56262
Data columns (total 11 columns):
dataType              1 non-null object
dataSubtype           1 non-null object
dateTime              1 non-null datetime64[ns]
category              1 non-null object
subcategory           1 non-null object
status                1 non-null object
address               1 non-null object
latitude              1 non-null float64
longitude             1 non-null float64
source                1 non-null object
extendedProperties    0 non-null object
dtypes: datetime64[ns](1), float64(2), object(8)
memory usage: 96.0+ bytes
In [1]:
# Pip install packages
import os, sys

!{sys.executable} -m pip install azure-storage-blob
!{sys.executable} -m pip install pyarrow
!{sys.executable} -m pip install pandas
In [2]:
# Azure storage access info
azure_storage_account_name = "azureopendatastorage"
azure_storage_sas_token = r""
container_name = "citydatacontainer"
folder_name = "Safety/Release/city=Boston"
In [3]:
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient

if azure_storage_account_name is None or azure_storage_sas_token is None:
    raise Exception(
        "Provide your specific name and key for your Azure Storage account--see the Prerequisites section earlier.")

print('Looking for the first parquet under the folder ' +
      folder_name + ' in container "' + container_name + '"...')
container_url = f"https://{azure_storage_account_name}.blob.core.windows.net/"
blob_service_client = BlobServiceClient(
    container_url, azure_storage_sas_token if azure_storage_sas_token else None)

container_client = blob_service_client.get_container_client(container_name)
blobs = container_client.list_blobs(folder_name)
sorted_blobs = sorted(list(blobs), key=lambda e: e.name, reverse=True)
targetBlobName = ''
for blob in sorted_blobs:
    if blob.name.startswith(folder_name) and blob.name.endswith('.parquet'):
        targetBlobName = blob.name
        break

print('Target blob to download: ' + targetBlobName)
_, filename = os.path.split(targetBlobName)
blob_client = container_client.get_blob_client(targetBlobName)
with open(filename, 'wb') as local_file:
    # readinto() streams the blob into the open file; it replaces the deprecated download_to_stream()
    blob_client.download_blob().readinto(local_file)
In [4]:
# Read the parquet file into Pandas data frame
import pandas as pd

print('Reading the parquet file into Pandas data frame')
df = pd.read_parquet(filename)
In [5]:
# You can add your own filter below
print('Loaded as a Pandas data frame: ')
df

Azure Databricks

In [1]:
# This is a package in preview.
# You need to pip install azureml-opendatasets on your Databricks cluster. https://docs.microsoft.com/en-us/azure/data-explorer/connect-from-databricks#install-the-python-library-on-your-azure-databricks-cluster
from azureml.opendatasets import BostonSafety

from datetime import datetime
from dateutil import parser


end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = BostonSafety(start_date=start_date, end_date=end_date)
safety = safety.to_spark_dataframe()
ActivityStarted, to_spark_dataframe
ActivityStarted, to_spark_dataframe_in_worker
ActivityCompleted: Activity=to_spark_dataframe_in_worker, HowEnded=Success, Duration=2380.02 [ms]
ActivityCompleted: Activity=to_spark_dataframe, HowEnded=Success, Duration=2381.75 [ms]
In [2]:
display(safety)
dataType | dataSubtype | dateTime | category | subcategory | status | address | latitude | longitude | source | extendedProperties
Safety | 311_All | 2015-07-24T12:48:24.000+0000 | Call Inquiry | OCR Front Desk Interactions | Closed |  | 42.3594 | -71.0587 | Constituent Call | null
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "citydatacontainer"
blob_relative_path = "Safety/Release/city=Boston"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# Spark reads Parquet lazily; no data is loaded at this point
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))
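
Once the `source` temporary view is registered, it can be queried with more than a plain `LIMIT`. The sketch below only builds a query string that counts requests per category, using the `category` column from the schema on this page. The helper name `top_categories_query` is hypothetical, and running the query requires an active Spark session, so that call is left as a comment.

```python
def top_categories_query(view: str = "source", limit: int = 10) -> str:
    """Build a Spark SQL query counting requests per category in the given temp view."""
    # Guard against anything other than a simple identifier being spliced into the SQL.
    if not view.isidentifier():
        raise ValueError(f"unexpected view name: {view!r}")
    return (
        "SELECT category, COUNT(*) AS requests "
        f"FROM {view} "
        "GROUP BY category "
        "ORDER BY requests DESC "
        f"LIMIT {int(limit)}"
    )


query = top_categories_query()
print(query)

# In a notebook with an active Spark session:
# display(spark.sql(query))
```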

Azure Synapse

In [1]:
from azureml.opendatasets import BostonSafety

from datetime import datetime
from dateutil import parser


end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = BostonSafety(start_date=start_date, end_date=end_date)
safety = safety.to_spark_dataframe()
ActivityStarted, to_spark_dataframe
ActivityStarted, to_spark_dataframe_in_worker
ActivityCompleted: Activity=to_spark_dataframe_in_worker, HowEnded=Success, Duration=2380.02 [ms]
ActivityCompleted: Activity=to_spark_dataframe, HowEnded=Success, Duration=2381.75 [ms]
In [2]:
display(safety)
dataType | dataSubtype | dateTime | category | subcategory | status | address | latitude | longitude | source | extendedProperties
Safety | 311_All | 2015-07-24T12:48:24.000+0000 | Call Inquiry | OCR Front Desk Interactions | Closed |  | 42.3594 | -71.0587 | Constituent Call | null
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "citydatacontainer"
blob_relative_path = "Safety/Release/city=Boston"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# Spark reads Parquet lazily; no data is loaded at this point
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))

City Safety

A Databricks notebook from the Urban Innovation Initiative at Microsoft Research for analytics on safety data (311 and 911 call data) from major U.S. cities. The analyses show frequency distributions and geographic clustering of safety issues within cities.
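
To give a feel for these two kinds of analysis, here is a minimal standard-library sketch over the ten preview rows from this page. It is not the notebook's method: the frequency distribution is a plain count per category, and the "clustering" is a coarse grid binning that rounds coordinates to two decimals. The helper name `grid_cell` and the choice of two decimals are assumptions made for illustration.

```python
from collections import Counter

# (category, latitude, longitude) taken from the preview rows on this page.
rows = [
    ("Highway Maintenance", 42.3086, -71.1158),
    ("Building", 42.3594, -71.0587),
    ("Environmental Services", 42.3639, -71.0521),
    ("Enforcement & Abandoned Vehicles", 42.2789, -71.1349),
    ("Code Enforcement", 42.3594, -71.0587),
    ("Signs & Signals", 42.3594, -71.0587),
    ("Highway Maintenance", 42.3609, -71.1433),
    ("Enforcement & Abandoned Vehicles", 42.2789, -71.1349),
    ("Highway Maintenance", 42.2893, -71.0521),
    ("Street Lights", 42.3353, -71.0789),
]

# Frequency distribution: number of requests per category.
category_counts = Counter(category for category, _, _ in rows)


def grid_cell(lat: float, lon: float, decimals: int = 2) -> tuple:
    """Map a coordinate to a coarse grid cell by rounding (a stand-in for real clustering)."""
    return (round(lat, decimals), round(lon, decimals))


# Geographic binning: number of requests per grid cell.
cell_counts = Counter(grid_cell(lat, lon) for _, lat, lon in rows)

print(category_counts.most_common(3))
print(cell_counts.most_common(3))
```

The busiest cell in this sample comes from the three intersection rows that share identical coordinates, so a real analysis would need to decide how to treat such rows before interpreting hot spots.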