Navigatie overslaan

Boston Safety Data

Boston 311 CRM Case Management City Services Public Safety

311-oproepen die aan de stad Boston worden gemeld.

Volg deze koppeling voor meer informatie over BOS:311.

Volume en retentie

Deze gegevensset wordt opgeslagen in de Parquet-indeling. De gegevensset wordt dagelijks bijgewerkt en bevat sinds 2019 in totaal ongeveer 100.000 rijen (10 MB).

Deze gegevensset bevat historische records die vanaf 2011 tot heden zijn verzameld. U kunt in onze SDK gebruikmaken van parameterinstellingen om gegevens op te halen binnen een specifiek tijdsbereik.

Opslaglocatie

Deze gegevensset wordt opgeslagen in de Azure-regio US - oost. Het wordt aanbevolen om rekenresources in US - oost toe te wijzen voor affiniteit.

Aanvullende informatie

Deze gegevensset is afkomstig van het stadsbestuur van Boston. U vindt hier meer informatie. Raadpleeg Open Data Commons Public Domain Dedication and License (ODC PDDL) voor de licentie voor het gebruik van deze gegevensset.

Mededelingen

AZURE OPEN GEGEVENSSETS WORDEN DOOR MICROSOFT ONGEWIJZIGD GELEVERD. MICROSOFT GEEFT GEEN GARANTIES, EXPLICIET OF IMPLICIET, ZEKERHEDEN OF VOORWAARDEN MET BETREKKING TOT HET GEBRUIK VAN DE GEGEVENSSETS. VOOR ZOVER IS TOEGESTAAN ONDER HET TOEPASSELIJKE RECHT, WIJST MICROSOFT ALLE AANSPRAKELIJKHEID AF VOOR SCHADE OF VERLIEZEN, WAARONDER GEVOLGSCHADE OF DIRECTE, SPECIALE, INDIRECTE, INCIDENTELE OF PUNITIEVE SCHADE DIE VOORTVLOEIT UIT HET GEBRUIK VAN DE GEGEVENSSETS.

Deze gegevensset wordt geleverd onder de oorspronkelijke voorwaarden dat Microsoft de brongegevens heeft ontvangen. De gegevensset kan gegevens bevatten die afkomstig zijn van Microsoft.

Access

Available inWhen to use
Azure Notebooks

Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine.

Azure Databricks

Use this when you need the scale of an Azure managed Spark cluster to process the dataset.

Azure Synapse

Use this when you need the scale of an Azure managed Spark cluster to process the dataset.

Preview

dataType dataSubtype dateTime category subcategory status address latitude longitude source extendedProperties
Safety 311_All 1/4/2021 11:35:57 PM Enforcement & Abandoned Vehicles Parking Enforcement Open 124 Boston St Dorchester MA 02125 42.3253 -71.059 Citizens Connect App
Safety 311_All 1/4/2021 11:32:16 PM Street Lights Street Light Outages Open 69 Coleman St Dorchester MA 02125 42.3078 -71.0671 Citizens Connect App
Safety 311_All 1/4/2021 11:24:08 PM Graffiti Graffiti Removal Open 161 Allston St Allston MA 02134 42.3481 -71.1381 Self Service
Safety 311_All 1/4/2021 11:18:30 PM Highway Maintenance Request for Pothole Repair Open 173 Alford St Charlestown MA 02129 42.3906 -71.0702 Citizens Connect App
Safety 311_All 1/4/2021 11:13:00 PM Noise Disturbance Automotive Noise Disturbance Open 122 Falcon St East Boston MA 02128 42.3821 -71.0342 Constituent Call
Safety 311_All 1/4/2021 11:06:01 PM Enforcement & Abandoned Vehicles Parking Enforcement Closed INTERSECTION of Brimmer St & Otis Pl Boston MA 42.3594 -71.0587 Citizens Connect App
Safety 311_All 1/4/2021 10:59:37 PM Code Enforcement Poor Conditions of Property Open INTERSECTION of Blue Hill Ave & Dewey St Roxbury MA 42.3594 -71.0587 Citizens Connect App
Safety 311_All 1/4/2021 10:58:11 PM Enforcement & Abandoned Vehicles Abandoned Vehicles Open 944 Dorchester Ave Dorchester MA 02125 42.3184 -71.0563 Citizens Connect App
Safety 311_All 1/4/2021 10:50:36 PM Street Cleaning Requests for Street Cleaning Open 154-158 Meridian St East Boston MA 02128 42.3741 -71.0392 Citizens Connect App
Safety 311_All 1/4/2021 10:50:28 PM Code Enforcement Poor Conditions of Property Open INTERSECTION of Blue Hill Ave & Dewey St Roxbury MA 42.3594 -71.0587 Citizens Connect App
Name Data type Unique Values (sample) Description
address string 138,798 \" \"
1 City Hall Plz Boston MA 02108

Locatie.

category string 54 Street Cleaning
Sanitation

Reden van de serviceaanvraag.

dataSubtype string 1 311_All

“311_All”

dataType string 1 Safety

‘Safety’

dateTime timestamp 1,471,504 2015-07-23 10:51:00
2015-07-23 10:47:00

Datum en tijdstip waarop de serviceaanvraag is geopend.

latitude double 1,622 42.3594
42.3603

Dit is de waarde voor de breedtegraad. Breedtegraden lopen parallel aan de evenaar.

longitude double 1,806 -71.0587
-71.0583

Dit is de waarde voor de lengtegraad. Lengtegraden staan haaks op breedtegraden en doorkruisen beide polen.

source string 7 Constituent Call
Citizens Connect App

Oorspronkelijke bron van de case.

status string 2 Closed
Open

Status van de case.

subcategory string 208 Parking Enforcement
Requests for Street Cleaning

Het type van de serviceaanvraag.

Select your preferred service:

Azure Notebooks

Azure Databricks

Azure Synapse

Azure Notebooks

Package: Language: Python Python
In [1]:
# This is a package in preview.
from azureml.opendatasets import BostonSafety

from datetime import datetime
from dateutil import parser


end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = BostonSafety(start_date=start_date, end_date=end_date)
safety = safety.to_pandas_dataframe()
ActivityStarted, to_pandas_dataframe ActivityStarted, to_pandas_dataframe_in_worker Looking for parquet files... Reading them into Pandas dataframe... Reading Safety/Release/city=Boston/part-00196-tid-845600952581210110-a4f62588-4996-42d1-bc79-23a9b4635c63-447039.c000.snappy.parquet under container citydatacontainer Done. ActivityCompleted: Activity=to_pandas_dataframe_in_worker, HowEnded=Success, Duration=2213.69 [ms] ActivityCompleted: Activity=to_pandas_dataframe, HowEnded=Success, Duration=2216.01 [ms]
In [2]:
safety.info()
<class 'pandas.core.frame.DataFrame'> Int64Index: 1 entries, 56262 to 56262 Data columns (total 11 columns): dataType 1 non-null object dataSubtype 1 non-null object dateTime 1 non-null datetime64[ns] category 1 non-null object subcategory 1 non-null object status 1 non-null object address 1 non-null object latitude 1 non-null float64 longitude 1 non-null float64 source 1 non-null object extendedProperties 0 non-null object dtypes: datetime64[ns](1), float64(2), object(8) memory usage: 96.0+ bytes
In [1]:
# Pip install packages
import os, sys

!{sys.executable} -m pip install azure-storage-blob
!{sys.executable} -m pip install pyarrow
!{sys.executable} -m pip install pandas
In [2]:
# Azure storage access info
azure_storage_account_name = "azureopendatastorage"
azure_storage_sas_token = r""
container_name = "citydatacontainer"
folder_name = "Safety/Release/city=Boston"
In [3]:
from azure.storage.blob import BlockBlobServicefrom azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient

if azure_storage_account_name is None or azure_storage_sas_token is None:
    raise Exception(
        "Provide your specific name and key for your Azure Storage account--see the Prerequisites section earlier.")

print('Looking for the first parquet under the folder ' +
      folder_name + ' in container "' + container_name + '"...')
container_url = f"https://{azure_storage_account_name}.blob.core.windows.net/"
blob_service_client = BlobServiceClient(
    container_url, azure_storage_sas_token if azure_storage_sas_token else None)

container_client = blob_service_client.get_container_client(container_name)
blobs = container_client.list_blobs(folder_name)
sorted_blobs = sorted(list(blobs), key=lambda e: e.name, reverse=True)
targetBlobName = ''
for blob in sorted_blobs:
    if blob.name.startswith(folder_name) and blob.name.endswith('.parquet'):
        targetBlobName = blob.name
        break

print('Target blob to download: ' + targetBlobName)
_, filename = os.path.split(targetBlobName)
blob_client = container_client.get_blob_client(targetBlobName)
with open(filename, 'wb') as local_file:
    blob_client.download_blob().download_to_stream(local_file)
In [4]:
# Read the parquet file into Pandas data frame
import pandas as pd

print('Reading the parquet file into Pandas data frame')
df = pd.read_parquet(filename)
In [5]:
# you can add your filter at below
print('Loaded as a Pandas data frame: ')
df
In [6]:
 

Azure Databricks

Package: Language: Python Python
In [1]:
# This is a package in preview.
# You need to pip install azureml-opendatasets in Databricks cluster. https://docs.microsoft.com/en-us/azure/data-explorer/connect-from-databricks#install-the-python-library-on-your-azure-databricks-cluster
from azureml.opendatasets import BostonSafety

from datetime import datetime
from dateutil import parser


end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = BostonSafety(start_date=start_date, end_date=end_date)
safety = safety.to_spark_dataframe()
ActivityStarted, to_spark_dataframe ActivityStarted, to_spark_dataframe_in_worker ActivityCompleted: Activity=to_spark_dataframe_in_worker, HowEnded=Success, Duration=2380.02 [ms] ActivityCompleted: Activity=to_spark_dataframe, HowEnded=Success, Duration=2381.75 [ms]
In [2]:
display(safety)
dataTypedataSubtypedateTimecategorysubcategorystatusaddresslatitudelongitudesourceextendedProperties
Safety311_All2015-07-24T12:48:24.000+0000Call InquiryOCR Front Desk InteractionsClosed 42.3594-71.0587Constituent Callnull
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "citydatacontainer"
blob_relative_path = "Safety/Release/city=Boston"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# SPARK read parquet, note that it won't load any data yet by now
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))

Azure Synapse

Package: Language: Python Python
In [1]:
from azureml.opendatasets import BostonSafety

from datetime import datetime
from dateutil import parser


end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = BostonSafety(start_date=start_date, end_date=end_date)
safety = safety.to_spark_dataframe()
ActivityStarted, to_spark_dataframe ActivityStarted, to_spark_dataframe_in_worker ActivityCompleted: Activity=to_spark_dataframe_in_worker, HowEnded=Success, Duration=2380.02 [ms] ActivityCompleted: Activity=to_spark_dataframe, HowEnded=Success, Duration=2381.75 [ms]
In [2]:
display(safety)
dataTypedataSubtypedateTimecategorysubcategorystatusaddresslatitudelongitudesourceextendedProperties
Safety311_All2015-07-24T12:48:24.000+0000Call InquiryOCR Front Desk InteractionsClosed 42.3594-71.0587Constituent Callnull
In [1]:
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "citydatacontainer"
blob_relative_path = "Safety/Release/city=Boston"
blob_sas_token = r""
In [2]:
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
  'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
  blob_sas_token)
print('Remote blob path: ' + wasbs_path)
In [3]:
# SPARK read parquet, note that it won't load any data yet by now
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
In [4]:
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))

City Safety

From the Urban Innovation Initiative at Microsoft Research, databricks notebook for analytics with safety data (311 and 911 call data) from major U.S. cities. Analyses show frequency distributions and geographic clustering of safety issues within cities.