
Boston Safety Data

311 Boston Case Management City Services CRM Public Safety

Description

This dataset contains information about 311 calls reported to the city of Boston.

Refer to this link to learn more about BOS:311.

Volume and Retention

This dataset is stored in Parquet format. It is updated daily, and contains about 100K rows (10MB) in total as of 2019.

This dataset contains historical records accumulated from 2011 to the present. You can use parameter settings in our SDK to fetch data within a specific time range.
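As a rough illustration of what fetching a time range means for these records, the sketch below filters a few hypothetical rows by their dateTime value using only the standard library. The sample rows and the `filter_by_range` helper are made up for this example, and the half-open interval (start included, end excluded) is an assumption: this page does not say whether the SDK treats the end date as inclusive.

```python
from datetime import datetime

# Hypothetical sample rows shaped like the dataset's dateTime and category columns.
rows = [
    {"dateTime": datetime(2015, 4, 30, 23, 59), "category": "Sanitation"},
    {"dateTime": datetime(2015, 5, 1, 0, 0), "category": "Recycling"},
    {"dateTime": datetime(2015, 12, 31, 18, 30), "category": "Building"},
    {"dateTime": datetime(2016, 1, 1, 0, 0), "category": "Animal Issues"},
]


def filter_by_range(records, start, end):
    """Keep records with start <= dateTime < end (half-open interval, an assumption)."""
    return [r for r in records if start <= r["dateTime"] < end]


selected = filter_by_range(rows, datetime(2015, 5, 1), datetime(2016, 1, 1))
print([r["category"] for r in selected])
```

With a half-open interval, adjacent ranges such as consecutive months can be fetched without counting any record twice.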

Storage Location

This dataset is stored in the East US Azure region. Allocating compute resources in East US is recommended for affinity.

Notices

MICROSOFT PROVIDES AZURE OPEN DATASETS ON AN “AS IS” BASIS. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, GUARANTEES OR CONDITIONS WITH RESPECT TO YOUR USE OF THE DATASETS. TO THE EXTENT PERMITTED UNDER YOUR LOCAL LAW, MICROSOFT DISCLAIMS ALL LIABILITY FOR ANY DAMAGES OR LOSSES, INCLUDING DIRECT, CONSEQUENTIAL, SPECIAL, INDIRECT, INCIDENTAL OR PUNITIVE, RESULTING FROM YOUR USE OF THE DATASETS.

This dataset is provided under the original terms under which Microsoft received the source data. The dataset may include data sourced from Microsoft. See below for more information.

This dataset is sourced from the City of Boston government. More details can be found here. Refer to the Open Data Commons Public Domain Dedication and License (ODC PDDL) for the license terms of this dataset.

Access

Available in | When to use
Azure Notebooks

Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine.

Azure Databricks

Use this when you need the scale of an Azure managed Spark cluster to process the dataset.

Preview

dataType | dataSubtype | dateTime | category | subcategory | status | address | latitude | longitude | source
Safety | 311_All | 7/15/2019 11:07:06 AM | Sanitation | Schedule a Bulk Item Pickup SS | Open | 4 Hestia Park Roxbury MA 02119 | 42.323 | -71.0852 | Self Service
Safety | 311_All | 7/15/2019 11:07:00 AM | Park Maintenance & Safety | Ground Maintenance | Open | | 42.3594 | -71.0587 | Constituent Call
Safety | 311_All | 7/15/2019 11:05:59 AM | Sanitation | Schedule a Bulk Item Pickup SS | Open | 19 Clancy Rd Dorchester MA 02124 | 42.2779 | -71.0705 | Self Service
Safety | 311_All | 7/15/2019 10:37:00 AM | Sanitation | Schedule a Bulk Item Pickup | Open | 51 Brock St Brighton MA 02135 | 42.3508 | -71.1601 | Constituent Call
Safety | 311_All | 7/14/2019 3:51:00 PM | Building | Exceeding Terms of Permit | Open | 11 Ruthven St Dorchester MA 02121 | 42.3594 | -71.0587 | Constituent Call
Safety | 311_All | 7/13/2019 8:04:00 PM | Sanitation | Schedule a Bulk Item Pickup | Open | 59 Birchcroft Rd Mattapan MA 02126 | 42.2667 | -71.109 | Constituent Call
Safety | 311_All | 7/11/2019 9:13:00 PM | Abandoned Bicycle | Abandoned Bicycle | Open | 281 Bremen St East Boston MA 02128 | 42.3753 | -71.0307 | Citizens Connect App
Safety | 311_All | 7/11/2019 7:15:00 PM | Animal Issues | Animal Generic Request | Open | 720 Albany St Roxbury MA 02118 | 42.3347 | -71.0714 | Constituent Call
Safety | 311_All | 7/11/2019 10:57:00 AM | Sanitation | Schedule a Bulk Item Pickup | Open | 10 Moore St East Boston MA 02128 | 42.3838 | -71.0201 | Constituent Call
Safety | 311_All | 7/11/2019 10:26:00 AM | Recycling | Request for Recycling Cart | Open | 32 Woodford St Dorchester MA 02125 | 42.3183 | -71.0708 | Employee Generated
Name  Data type  Unique  Values (sample)  Description
address string 60,314 ""
1 City Hall Plz Boston MA 02108

Location.

category string 53 Street Cleaning
Sanitation

Reason for the service request.

dataSubtype string 1 311_All

“311_All”

dataType string 1 Safety

“Safety”

dateTime timestamp 123,854 2016-11-10 11:20:00
2017-02-10 11:35:00

Open date and time of the service request.

extendedProperties string 12,380 "CASE_ENQUIRY_ID:101001982307,CASE_TITLE:CE Collection,CLOSURE_REASON:Case Closed. Closed date : 2017-01-05 12:30:59.64 Case Resolved tree picked up ,ClosedPhoto:https://cityworker.cityofboston.gov:8443/attachments/report/586e704f70b4898bb4af8e7c/closed_photo/Report.jpg,Department:PWDx,LOCATION_STREET_NAME:100 Ashmont St,LOCATION_ZIPCODE:02124,OnTime:ONTIME,QUEUE:PWDx_District 07: South Dorchester,SUBJECT:Public Works Department,TARGET_DT:2017-01-06 11:11:59,closed_dt:2017-01-05 12:30:59,city_council_district:4,neighborhood:Dorchester,neighborhood_services_district:8,police_district:C11,precinct:1709,pwd_district:07,ward:Ward 17"
"CASE_ENQUIRY_ID:101002888511,CASE_TITLE:Improper Storage of Trash (Barrels),CLOSURE_REASON:Case Closed. Closed date : 2019-04-24 12:16:20.663 Case Noted No code enforcement violation found at this time ,ClosedPhoto:https://cityworker.cityofboston.gov:8443/attachments/report/5cc044ccd052db0c4b1619ff/closed_photo/Report.jpg,Department:PWDx,LOCATION_STREET_NAME:421 Marlborough St,LOCATION_ZIPCODE:02215,OnTime:ONTIME,QUEUE:PWDx_Code Enforcement,SUBJECT:Public Works Department,TARGET_DT:2019-04-26 08:30:00,closed_dt:2019-04-24 12:16:20,city_council_district:8,neighborhood:Back Bay,neighborhood_services_district:14,police_district:D4,precinct:0509,pwd_district:10A,ward:Ward 5"

Additional fields with “key:value” pair format for each 311 service request.
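Because extendedProperties packs many fields into one string, it usually has to be split before use. The following is a minimal sketch of one way to do that, assuming keys consist only of letters, digits, and underscores, and that a comma-separated segment without such a key belongs to the previous value. The `parse_extended_properties` helper is hypothetical and is not part of any SDK; the sample string is a shortened version of the first sample value above.

```python
import re

# Assumption: a real key contains only letters, digits, and underscores.
_KEY = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def parse_extended_properties(raw):
    """Parse a 'key:value,key:value' string into a dict (heuristic sketch)."""
    result = {}
    last_key = None
    if not raw:
        return result
    for segment in raw.strip('"').split(","):
        # Split on the first colon only, since values may contain colons
        # (timestamps, URLs, queue names).
        key, sep, value = segment.partition(":")
        if sep and _KEY.match(key):
            result[key] = value
            last_key = key
        elif last_key is not None:
            # No recognizable key: treat the segment as the continuation
            # of a previous value that itself contained a comma.
            result[last_key] += "," + segment
    return result


sample = (
    "CASE_ENQUIRY_ID:101001982307,CASE_TITLE:CE Collection,"
    "LOCATION_STREET_NAME:100 Ashmont St,LOCATION_ZIPCODE:02124,"
    "QUEUE:PWDx_District 07: South Dorchester,"
    "TARGET_DT:2017-01-06 11:11:59,neighborhood:Dorchester,ward:Ward 17"
)
props = parse_extended_properties(sample)
print(props["QUEUE"])
print(props["TARGET_DT"])
```

The heuristic can still misread a value that contains a comma followed by a single word and a colon, so check the parsed keys against the ones you expect before relying on them.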

latitude double 1,606 42.3594
42.3603

This is the latitude value. Lines of latitude are parallel to the equator.

longitude double 1,777 -71.0587
-71.0583

This is the longitude value. Lines of longitude run perpendicular to lines of latitude, and all pass through both poles.

source string 7 Constituent Call
Citizens Connect App

Original source of the case.

status string 2 Closed
Open

Case status.

subcategory string 194 Request for Snow Plowing
Schedule a Bulk Item Pickup

Type of the service request.

Azure Notebooks

Language: Python
In [1]:
# This is a package in preview.
from azureml.opendatasets import BostonSafety

from datetime import datetime
from dateutil import parser


end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = BostonSafety(start_date=start_date, end_date=end_date)
safety = safety.to_pandas_dataframe()
ActivityStarted, to_pandas_dataframe
ActivityStarted, to_pandas_dataframe_in_worker
Looking for parquet files...
Reading them into Pandas dataframe...
Reading Safety/Release/city=Boston/part-00196-tid-845600952581210110-a4f62588-4996-42d1-bc79-23a9b4635c63-447039.c000.snappy.parquet under container citydatacontainer
Done.
ActivityCompleted: Activity=to_pandas_dataframe_in_worker, HowEnded=Success, Duration=2213.69 [ms]
ActivityCompleted: Activity=to_pandas_dataframe, HowEnded=Success, Duration=2216.01 [ms]
In [2]:
safety.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1 entries, 56262 to 56262
Data columns (total 11 columns):
dataType              1 non-null object
dataSubtype           1 non-null object
dateTime              1 non-null datetime64[ns]
category              1 non-null object
subcategory           1 non-null object
status                1 non-null object
address               1 non-null object
latitude              1 non-null float64
longitude             1 non-null float64
source                1 non-null object
extendedProperties    0 non-null object
dtypes: datetime64[ns](1), float64(2), object(8)
memory usage: 96.0+ bytes
# Pip install packages
import os, sys

!{sys.executable} -m pip install azure-storage
!{sys.executable} -m pip install pyarrow
!{sys.executable} -m pip install pandas

# COMMAND ----------

# Azure storage access info
azure_storage_account_name = "azureopendatastorage"
azure_storage_sas_token = r""
container_name = "citydatacontainer"
folder_name = "Safety/Release/city=Boston"

# COMMAND ----------

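# BlockBlobService belongs to the legacy storage SDK (azure-storage-blob 2.x and earlier);
# it is not available in azure-storage-blob 12 and later.
from azure.storage.blob import BlockBlobService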

if azure_storage_account_name is None or azure_storage_sas_token is None:
    raise Exception("Provide your specific name and key for your Azure Storage account--see the Prerequisites section earlier.")

print('Looking for the first parquet under the folder ' + folder_name + ' in container "' + container_name + '"...')
blob_service = BlockBlobService(account_name = azure_storage_account_name, sas_token = azure_storage_sas_token,)
blobs = blob_service.list_blobs(container_name)
sorted_blobs = sorted(list(blobs), key=lambda e: e.name, reverse=True)
targetBlobName=''
for blob in sorted_blobs:
    if blob.name.startswith(folder_name) and blob.name.endswith('.parquet'):
        targetBlobName = blob.name
        break

print('Target blob to download: ' + targetBlobName)
_, filename = os.path.split(targetBlobName)
parquet_file=blob_service.get_blob_to_path(container_name, targetBlobName, filename)

# COMMAND ----------

# Read the local parquet file into Pandas data frame
import pyarrow.parquet as pq
import pandas as pd

print('Reading the local parquet file into Pandas data frame')
df = pq.read_table(filename).to_pandas()

# COMMAND ----------

# Add your own filters below
print('Loaded as a Pandas data frame: ')
df

# COMMAND ----------


Azure Databricks

Language: Python
In [1]:
# This is a package in preview.
# You need to pip install azureml-opendatasets in Databricks cluster. https://docs.microsoft.com/en-us/azure/data-explorer/connect-from-databricks#install-the-python-library-on-your-azure-databricks-cluster
from azureml.opendatasets import BostonSafety

from datetime import datetime
from dateutil import parser


end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = BostonSafety(start_date=start_date, end_date=end_date)
safety = safety.to_spark_dataframe()
ActivityStarted, to_spark_dataframe
ActivityStarted, to_spark_dataframe_in_worker
ActivityCompleted: Activity=to_spark_dataframe_in_worker, HowEnded=Success, Duration=2380.02 [ms]
ActivityCompleted: Activity=to_spark_dataframe, HowEnded=Success, Duration=2381.75 [ms]
In [2]:
display(safety)
dataType | dataSubtype | dateTime | category | subcategory | status | address | latitude | longitude | source | extendedProperties
Safety | 311_All | 2015-07-24T12:48:24.000+0000 | Call Inquiry | OCR Front Desk Interactions | Closed | | 42.3594 | -71.0587 | Constituent Call | null
# Databricks notebook source
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "citydatacontainer"
blob_relative_path = "Safety/Release/city=Boston"
blob_sas_token = r""

# COMMAND ----------

# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)

# COMMAND ----------

# Spark reads Parquet lazily: no data is loaded at this point
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')

# COMMAND ----------

# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))

City Safety

A Databricks notebook from the Urban Innovation Initiative at Microsoft Research for analyzing safety data (311 and 911 call data) from major U.S. cities. The analyses show frequency distributions and geographic clustering of safety issues within cities.