Boston Safety Data

Boston 311 CRM Case Management City Services Public Safety

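311 calls reported to the city of Boston.

Refer to this link to learn more about BOS:311.

Volume and retention

This dataset is stored in Parquet format. It is updated daily and contains about 100K rows (10 MB) in total as of 2019.

This dataset contains historical records accumulated from 2011 to the present. You can use parameter settings in our SDK to fetch data within a specific time range.

Storage location

This dataset is stored in the East US Azure region. Allocating compute resources in East US is recommended for affinity.

Additional information

This dataset is sourced from the city of Boston government. More details can be found here. For the license covering use of this dataset, see the Open Data Commons Public Domain Dedication and License (ODC PDDL).

Notices

Microsoft provides Azure Open Datasets on an "as is" basis. Microsoft makes no warranties, express or implied, guarantees or conditions with respect to your use of the datasets. To the extent permitted under your local law, Microsoft disclaims all liability for any damages or losses, including direct, consequential, special, indirect, incidental or punitive, resulting from your use of the datasets.

This dataset is provided under the original terms that Microsoft received source data. The dataset may include data sourced from Microsoft.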

Access

Available in | When to use
Azure Notebooks

Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine.

Azure Databricks

Use this when you need the scale of an Azure managed Spark cluster to process the dataset.

Azure Synapse

Use this when you need the scale of an Azure managed Spark cluster to process the dataset.

Preview

dataType | dataSubtype | dateTime | category | subcategory | status | address | latitude | longitude | source | extendedProperties
Safety | 311_All | 5/12/2020 11:38:02 PM | Enforcement & Abandoned Vehicles | Abandoned Vehicles | Open | 34 Chestnut Sq Jamaica Plain MA 02130 | 42.3149 | -71.1069 | Citizens Connect App |
Safety | 311_All | 5/12/2020 11:33:49 PM | Code Enforcement | Improper Storage of Trash (Barrels) | Open | 65 O'Reilly Way Charlestown MA 02129 | 42.3783 | -71.0591 | Citizens Connect App |
Safety | 311_All | 5/12/2020 10:43:08 PM | Signs & Signals | Traffic Signal Inspection | Open | INTERSECTION of Morton St & Norfolk St Dorchester MA | 42.3594 | -71.0587 | Citizens Connect App |
Safety | 311_All | 5/12/2020 10:41:00 PM | Recycling | Sticker Request | Open | 13 Brookfield St Roslindale MA 02131 | 42.2879 | -71.1328 | Constituent Call |
Safety | 311_All | 5/12/2020 10:21:23 PM | Street Cleaning | Pick up Dead Animal | Open | 1-5 Woolsey Sq Jamaica Plain MA 02130 | 42.3106 | -71.1077 | Citizens Connect App |
Safety | 311_All | 5/12/2020 10:11:00 PM | Sanitation | Schedule a Bulk Item Pickup | Open | 12 Drury Rd Hyde Park MA 02136 | 42.251 | -71.1341 | Constituent Call |
Safety | 311_All | 5/12/2020 10:09:37 PM | Code Enforcement | Improper Storage of Trash (Barrels) | Open | 37 N Bennet St Boston MA 02113 | 42.3657 | -71.0547 | Citizens Connect App |
Safety | 311_All | 5/12/2020 10:08:33 PM | Signs & Signals | Sign Repair | Open | INTERSECTION of Weld St & Maple St West Roxbury MA | 42.3594 | -71.0587 | Citizens Connect App |
Safety | 311_All | 5/12/2020 9:44:56 PM | Enforcement & Abandoned Vehicles | Parking Enforcement | Open | 14 Shelby St East Boston MA 02128 | 42.3809 | -71.0277 | Citizens Connect App |
Safety | 311_All | 5/12/2020 9:44:26 PM | Enforcement & Abandoned Vehicles | Parking Enforcement | Closed | 48 Kingsdale St Dorchester MA 02124 | 42.2947 | -71.0814 | Citizens Connect App |
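Name | Data type | Unique | Values (sample) | Description
address | string | 139,608 | " "; 1 City Hall Plz Boston MA 02108 | Location.
category | string | 54 | Street Cleaning; Sanitation | Reason for the service request.
dataSubtype | string | 1 | 311_All | "311_All"
dataType | string | 1 | Safety | "Safety"
dateTime | timestamp | 1,540,380 | 2015-07-23 10:51:00; 2015-07-23 10:47:00 | Open date and time of the service request.
latitude | double | 1,622 | 42.3594; 42.3603 | Latitude value. Lines of latitude run parallel to the equator.
longitude | double | 1,806 | -71.0587; -71.0583 | Longitude value. Lines of longitude run perpendicular to lines of latitude and all pass through both poles.
source | string | 7 | Constituent Call; Citizens Connect App | Original source of the case.
status | string | 2 | Closed; Open | Case status.
subcategory | string | 208 | Parking Enforcement; Requests for Street Cleaning | Type of the service request.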
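
The column list above can be read as a record type. The following is a minimal sketch, assuming the US-style timestamp format seen in the preview rows; the `ServiceRequest` class and `parse_preview_row` helper are hypothetical names introduced for illustration and are not part of any SDK.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ServiceRequest:
    """One 311 record, mirroring the documented columns (hypothetical helper type)."""
    dataType: str
    dataSubtype: str
    dateTime: datetime
    category: str
    subcategory: str
    status: str
    address: str
    latitude: float
    longitude: float
    source: str
    extendedProperties: Optional[str] = None


def parse_preview_row(fields):
    """Convert ten string fields, in preview column order, into a typed record."""
    (data_type, data_subtype, date_time, category, subcategory,
     status, address, latitude, longitude, source) = fields
    return ServiceRequest(
        dataType=data_type,
        dataSubtype=data_subtype,
        # Assumption: timestamps look like "5/12/2020 11:38:02 PM", as in the preview.
        dateTime=datetime.strptime(date_time, "%m/%d/%Y %I:%M:%S %p"),
        category=category,
        subcategory=subcategory,
        status=status,
        address=address,
        latitude=float(latitude),
        longitude=float(longitude),
        source=source,
    )


# First preview row, split into its ten populated columns.
row = parse_preview_row([
    "Safety", "311_All", "5/12/2020 11:38:02 PM",
    "Enforcement & Abandoned Vehicles", "Abandoned Vehicles", "Open",
    "34 Chestnut Sq Jamaica Plain MA 02130", "42.3149", "-71.1069",
    "Citizens Connect App",
])
print(row.dateTime.isoformat(), row.status, row.latitude, row.longitude)
```

When the data is loaded from Parquet, these types are already applied, so a helper like this is only relevant for text exports such as the preview.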

Azure Notebooks

In [1]:
# This is a package in preview.
from azureml.opendatasets import BostonSafety

from datetime import datetime
from dateutil import parser


end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = BostonSafety(start_date=start_date, end_date=end_date)
safety = safety.to_pandas_dataframe()
ActivityStarted, to_pandas_dataframe
ActivityStarted, to_pandas_dataframe_in_worker
Looking for parquet files...
Reading them into Pandas dataframe...
Reading Safety/Release/city=Boston/part-00196-tid-845600952581210110-a4f62588-4996-42d1-bc79-23a9b4635c63-447039.c000.snappy.parquet under container citydatacontainer
Done.
ActivityCompleted: Activity=to_pandas_dataframe_in_worker, HowEnded=Success, Duration=2213.69 [ms]
ActivityCompleted: Activity=to_pandas_dataframe, HowEnded=Success, Duration=2216.01 [ms]
In [2]:
safety.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1 entries, 56262 to 56262
Data columns (total 11 columns):
dataType              1 non-null object
dataSubtype           1 non-null object
dateTime              1 non-null datetime64[ns]
category              1 non-null object
subcategory           1 non-null object
status                1 non-null object
address               1 non-null object
latitude              1 non-null float64
longitude             1 non-null float64
source                1 non-null object
extendedProperties    0 non-null object
dtypes: datetime64[ns](1), float64(2), object(8)
memory usage: 96.0+ bytes
In [1]:
# Pip install packages
import os, sys

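# BlockBlobService (used below) ships in the legacy azure-storage-blob 2.x line;
# the "azure-storage" meta-package is deprecated.
!{sys.executable} -m pip install "azure-storage-blob==2.1.0"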
!{sys.executable} -m pip install pyarrow
!{sys.executable} -m pip install pandas
In [2]:
# Azure storage access info
azure_storage_account_name = "azureopendatastorage"
azure_storage_sas_token = r""
container_name = "citydatacontainer"
folder_name = "Safety/Release/city=Boston"
In [3]:
from azure.storage.blob import BlockBlobService

if azure_storage_account_name is None or azure_storage_sas_token is None:
    raise Exception("Provide your specific name and key for your Azure Storage account--see the Prerequisites section earlier.")

print('Looking for the first parquet under the folder ' + folder_name + ' in container "' + container_name + '"...')
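blob_service = BlockBlobService(account_name=azure_storage_account_name, sas_token=azure_storage_sas_token)
# Ask the service for blobs under the folder only, instead of listing the whole container
blobs = blob_service.list_blobs(container_name, prefix=folder_name)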
sorted_blobs = sorted(list(blobs), key=lambda e: e.name, reverse=True)
targetBlobName=''
for blob in sorted_blobs:
    if blob.name.startswith(folder_name) and blob.name.endswith('.parquet'):
        targetBlobName = blob.name
        break

print('Target blob to download: ' + targetBlobName)
_, filename = os.path.split(targetBlobName)
parquet_file=blob_service.get_blob_to_path(container_name, targetBlobName, filename)
In [4]:
# Read the local parquet file into Pandas data frame
import pyarrow.parquet as pq
import pandas as pd

print('Reading the local parquet file into Pandas data frame')
df = pq.read_table(filename).to_pandas()
In [5]:
# You can add your own filter below
print('Loaded as a Pandas data frame: ')
df
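
As one possible filter on the loaded data frame, the sketch below keeps rows inside a time window and, optionally, with a given status. It assumes the `dateTime` and `status` columns described in the schema; `filter_requests` is a hypothetical helper, and the small `sample` frame stands in for `df` using three of the preview rows.

```python
import pandas as pd


def filter_requests(df, start, end, status=None):
    """Return rows with start <= dateTime < end, optionally matching a status."""
    mask = (df["dateTime"] >= pd.Timestamp(start)) & (df["dateTime"] < pd.Timestamp(end))
    if status is not None:
        mask &= df["status"] == status
    return df[mask]


# Stand-in for the real data frame: three rows copied from the preview table.
sample = pd.DataFrame({
    "dateTime": pd.to_datetime([
        "2020-05-12 23:38:02",
        "2020-05-12 22:11:00",
        "2020-05-12 21:44:26",
    ]),
    "category": [
        "Enforcement & Abandoned Vehicles",
        "Sanitation",
        "Enforcement & Abandoned Vehicles",
    ],
    "status": ["Open", "Open", "Closed"],
})

# Open requests received from 10 PM onward on 12 May 2020.
late_open = filter_requests(sample, "2020-05-12 22:00", "2020-05-13", status="Open")
print(late_open)
```

With the real data, the same call would be `filter_requests(df, start, end, status="Open")`.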

Azure Databricks

In [1]:
# This is a package in preview.
# You need to pip install azureml-opendatasets in Databricks cluster. https://docs.microsoft.com/en-us/azure/data-explorer/connect-from-databricks#install-the-python-library-on-your-azure-databricks-cluster
from azureml.opendatasets import BostonSafety

from datetime import datetime
from dateutil import parser


end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = BostonSafety(start_date=start_date, end_date=end_date)
safety = safety.to_spark_dataframe()
ActivityStarted, to_spark_dataframe
ActivityStarted, to_spark_dataframe_in_worker
ActivityCompleted: Activity=to_spark_dataframe_in_worker, HowEnded=Success, Duration=2380.02 [ms]
ActivityCompleted: Activity=to_spark_dataframe, HowEnded=Success, Duration=2381.75 [ms]
In [2]:
display(safety)
dataType | dataSubtype | dateTime | category | subcategory | status | address | latitude | longitude | source | extendedProperties
Safety | 311_All | 2015-07-24T12:48:24.000+0000 | Call Inquiry | OCR Front Desk Interactions | Closed | | 42.3594 | -71.0587 | Constituent Call | null
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "citydatacontainer"
blob_relative_path = "Safety/Release/city=Boston"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# Spark reads Parquet lazily; no data is loaded at this point
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))
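
The `wasbs://` path and the SAS configuration key built in the cells above follow a fixed pattern derived from the container and account names. The sketch below isolates that pattern in two small functions so it can be checked without a cluster; `wasbs_path` and `sas_conf_key` are hypothetical helper names, not part of Spark or any Azure SDK.

```python
def wasbs_path(container, account, relative_path):
    """Build the wasbs:// URL that Spark uses to address a folder in Blob storage."""
    return f"wasbs://{container}@{account}.blob.core.windows.net/{relative_path}"


def sas_conf_key(container, account):
    """Build the Spark configuration key that holds the SAS token for a container."""
    return f"fs.azure.sas.{container}.{account}.blob.core.windows.net"


path = wasbs_path("citydatacontainer", "azureopendatastorage", "Safety/Release/city=Boston")
key = sas_conf_key("citydatacontainer", "azureopendatastorage")
print(path)
print(key)

# On a cluster, these values would be used as in the cells above:
#   spark.conf.set(key, blob_sas_token)
#   df = spark.read.parquet(path)
```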

Azure Synapse

In [1]:
from azureml.opendatasets import BostonSafety

from datetime import datetime
from dateutil import parser


end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = BostonSafety(start_date=start_date, end_date=end_date)
safety = safety.to_spark_dataframe()
ActivityStarted, to_spark_dataframe
ActivityStarted, to_spark_dataframe_in_worker
ActivityCompleted: Activity=to_spark_dataframe_in_worker, HowEnded=Success, Duration=2380.02 [ms]
ActivityCompleted: Activity=to_spark_dataframe, HowEnded=Success, Duration=2381.75 [ms]
In [2]:
display(safety)
dataType | dataSubtype | dateTime | category | subcategory | status | address | latitude | longitude | source | extendedProperties
Safety | 311_All | 2015-07-24T12:48:24.000+0000 | Call Inquiry | OCR Front Desk Interactions | Closed | | 42.3594 | -71.0587 | Constituent Call | null
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "citydatacontainer"
blob_relative_path = "Safety/Release/city=Boston"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# Spark reads Parquet lazily; no data is loaded at this point
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))

City Safety

From the Urban Innovation Initiative at Microsoft Research, a Databricks notebook for analytics with safety data (311 and 911 call data) from major U.S. cities. Analyses show frequency distributions and geographic clustering of safety issues within cities.
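
To give a rough idea of what a frequency distribution and a coarse geographic grouping look like on this data, here is a minimal sketch using only the ten preview rows. It is not the notebook's method: rounding coordinates to a grid is a simple stand-in for real clustering, and `category_counts` and `grid_counts` are hypothetical helper names.

```python
from collections import Counter

# (category, latitude, longitude) copied from the ten preview rows.
records = [
    ("Enforcement & Abandoned Vehicles", 42.3149, -71.1069),
    ("Code Enforcement", 42.3783, -71.0591),
    ("Signs & Signals", 42.3594, -71.0587),
    ("Recycling", 42.2879, -71.1328),
    ("Street Cleaning", 42.3106, -71.1077),
    ("Sanitation", 42.251, -71.1341),
    ("Code Enforcement", 42.3657, -71.0547),
    ("Signs & Signals", 42.3594, -71.0587),
    ("Enforcement & Abandoned Vehicles", 42.3809, -71.0277),
    ("Enforcement & Abandoned Vehicles", 42.2947, -71.0814),
]


def category_counts(rows):
    """Count how many requests fall into each category."""
    return Counter(category for category, _, _ in rows)


def grid_counts(rows, decimals=2):
    """Count requests per grid cell, where a cell is the rounded (lat, lon) pair."""
    return Counter(
        (round(lat, decimals), round(lon, decimals)) for _, lat, lon in rows
    )


by_category = category_counts(records)
by_cell = grid_counts(records)

for category, count in by_category.most_common():
    print(f"{count:2d}  {category}")
for cell, count in by_cell.most_common(3):
    print(f"{count:2d}  {cell}")
```

Rounding to two decimal places groups points into cells of roughly one kilometer; a real analysis would more likely use a proper clustering algorithm on the full dataset.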