Fire department calls for service and 311 cases in San Francisco.
Fire Calls-For-Service includes all fire unit responses to calls. Each record includes the call number, incident number, address, unit identifier, call type, and disposition, along with all relevant time intervals. Because this dataset is based on responses, and most calls involve multiple units, there are multiple records per call number. Addresses are associated with a block number, intersection, or call box, not a specific address.
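Because each responding unit produces its own record, a single call can appear several times; collapsing records back to one entry per call is a common first step. A minimal sketch with the standard library, assuming records parsed into dicts with hypothetical `call_number` and `unit_id` fields (these identifiers come from the source fire-department data, not the Open Datasets schema):

```python
from collections import defaultdict

# Illustrative records: one row per unit response, multiple rows per call.
records = [
    {"call_number": "201", "unit_id": "E01", "category": "Alarm"},
    {"call_number": "201", "unit_id": "T01", "category": "Alarm"},
    {"call_number": "202", "unit_id": "M03", "category": "Medical Incident"},
]

# Group the per-unit rows by call number.
calls = defaultdict(list)
for rec in records:
    calls[rec["call_number"]].append(rec)

# One entry per call, listing the units that responded.
summary = {cn: [r["unit_id"] for r in recs] for cn, recs in calls.items()}
print(summary)  # {'201': ['E01', 'T01'], '202': ['M03']}
```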
311 Cases includes cases generally associated with a place or thing (for example parks, streets, or buildings) and created July 1, 2008 or later. Cases generally logged by a user regarding their own needs (for example, property or business tax questions, parking permit requests) are not included. See the Program Link for more information.
Volume and Retention
This dataset is stored in Parquet format. It is updated daily and contains about 6M rows (400 MB) in total as of 2019.
This dataset contains historical records accumulated from 2015 to the present. You can use parameter settings in our SDK to fetch data within a specific time range.
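The SDK's `start_date` and `end_date` parameters restrict the rows returned to a time window. The equivalent filter, sketched with the standard library over illustrative in-memory rows (the timestamps here are invented):

```python
from datetime import datetime

# Stand-in rows with a dateTime field, as in the dataset's schema.
rows = [
    {"dateTime": datetime(2015, 4, 30, 23, 59)},
    {"dateTime": datetime(2015, 6, 15, 8, 30)},
    {"dateTime": datetime(2016, 1, 2, 0, 0)},
]

start_date = datetime(2015, 5, 1)
end_date = datetime(2016, 1, 1)

# Keep only rows whose timestamp falls inside the requested window.
in_range = [r for r in rows if start_date <= r["dateTime"] <= end_date]
print(len(in_range))  # 1
```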
Storage Location
This dataset is stored in the East US Azure region. Allocating compute resources in East US is recommended for affinity.
Additional Information
This dataset is sourced from the City of San Francisco government. More details can be found at the following links: Fire Department Calls, 311 Cases.
Refer here for the terms of use for this dataset.
Notices
MICROSOFT PROVIDES AZURE OPEN DATASETS ON AN “AS IS” BASIS. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, GUARANTEES OR CONDITIONS WITH RESPECT TO YOUR USE OF THE DATASETS. TO THE EXTENT PERMITTED UNDER YOUR LOCAL LAW, MICROSOFT DISCLAIMS ALL LIABILITY FOR ANY DAMAGES OR LOSSES, INCLUDING DIRECT, CONSEQUENTIAL, SPECIAL, INDIRECT, INCIDENTAL OR PUNITIVE, RESULTING FROM YOUR USE OF THE DATASETS.
This dataset is provided under the original terms under which Microsoft received the source data. The dataset may include data sourced from Microsoft.
Access
Available in | When to use |
---|---|
Azure Notebooks | Quickly explore the dataset with Jupyter notebooks hosted on Azure or your local machine. |
Azure Databricks | Use this when you need the scale of an Azure managed Spark cluster to process the dataset. |
Azure Synapse | Use this when you need the scale of an Azure managed Spark cluster to process the dataset. |
Preview
dataType | dataSubtype | dateTime | category | subcategory | status | address | source | extendedProperties |
---|---|---|---|---|---|---|---|---|
Safety | 911_Fire | 1/21/2021 2:59:53 AM | Non Life-threatening | Medical Incident | null | 900 Block of HYDE ST | null | |
Safety | 911_Fire | 1/21/2021 2:55:16 AM | Alarm | Other | null | 17TH ST/CLAYTON ST | null | |
Safety | 911_Fire | 1/21/2021 2:48:02 AM | Potentially Life-Threatening | Medical Incident | null | 6TH ST/STEVENSON ST | null | |
Safety | 911_Fire | 1/21/2021 2:31:55 AM | Potentially Life-Threatening | Medical Incident | null | 17TH ST/ROOSEVELT WY | null | |
Safety | 911_Fire | 1/21/2021 2:28:05 AM | null | Alarms | null | 500 Block of CHURCH ST | null | |
Safety | 911_Fire | 1/21/2021 2:28:05 AM | null | Alarms | null | 500 Block of CHURCH ST | null | |
Safety | 911_Fire | 1/21/2021 2:28:05 AM | null | Alarms | null | 500 Block of CHURCH ST | null | |
Safety | 911_Fire | 1/21/2021 2:20:34 AM | Alarm | Other | null | 400 Block of PARNASSUS AVE | null | |
Safety | 911_Fire | 1/21/2021 2:20:34 AM | Alarm | Other | null | 400 Block of PARNASSUS AVE | null | |
Safety | 911_Fire | 1/21/2021 2:20:34 AM | Alarm | Other | null | 400 Block of PARNASSUS AVE | null |
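The preview renders timestamps as, for example, `1/21/2021 2:59:53 AM`. If you copy values out of the preview, they can be parsed with the standard library; the format string below matches the preview's rendering only (in the Parquet files, `dateTime` is a proper timestamp column):

```python
from datetime import datetime

# Parse a timestamp as it appears in the preview table.
raw = "1/21/2021 2:59:53 AM"
parsed = datetime.strptime(raw, "%m/%d/%Y %I:%M:%S %p")
print(parsed.isoformat())  # 2021-01-21T02:59:53
```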
Name | Data type | Unique | Values (sample) | Description |
---|---|---|---|---|
address | string | 270,913 | Not associated with a specific address; 0 Block of 6TH ST | Address of the incident (note: address and location are generalized to the mid-block of the street, an intersection, or the nearest call box location to protect caller privacy). |
category | string | 108 | Street and Sidewalk Cleaning; Potentially Life-Threatening | The human-readable name of the 311 service request type, or the call type group for 911 fire calls. |
dataSubtype | string | 2 | 911_Fire; 311_All | "911_Fire" or "311_All". |
dataType | string | 1 | Safety | "Safety" |
dateTime | timestamp | 6,330,062 | 2020-10-19 12:28:08; 2020-07-28 06:40:26 | The date and time when the service request was made or the fire call was received. |
latitude | double | 1,518,478 | 37.777624238929; 37.786117211838 | Latitude of the location, using the WGS84 projection. |
longitude | double | 1,461,242 | -122.39998111124; -122.419854245692 | Longitude of the location, using the WGS84 projection. |
source | string | 9 | Phone; Mobile/Open311 | Mechanism or path by which the service request was received; typically "Phone", "Text/SMS", "Website", "Mobile App", "Twitter", etc., but terms may vary by system. |
status | string | 3 | Closed; Open | A single-word indicator of the current state of the service request. (Note: GeoReport V2 only permits "open" and "closed".) |
subcategory | string | 1,270 | Medical Incident; Bulky Items | The human-readable name of the service request subtype for 311 cases, or the call type for 911 fire calls. |
Azure Notebooks
# This is a package in preview.
from azureml.opendatasets import SanFranciscoSafety
from datetime import datetime
from dateutil import parser
end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = SanFranciscoSafety(start_date=start_date, end_date=end_date)
safety = safety.to_pandas_dataframe()
safety.info()
# Pip install packages
import os, sys
!{sys.executable} -m pip install azure-storage-blob
!{sys.executable} -m pip install pyarrow
!{sys.executable} -m pip install pandas
# Azure storage access info
azure_storage_account_name = "azureopendatastorage"
azure_storage_sas_token = r""
container_name = "citydatacontainer"
folder_name = "Safety/Release/city=SanFrancisco"
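The account, container, and folder settings above compose into the HTTPS prefix the blob client reads from; note the folder is a Hive-style partition path (`city=SanFrancisco`). A minimal sketch of the layout:

```python
# Compose the blob URL prefix from the storage settings above.
account = "azureopendatastorage"
container = "citydatacontainer"
folder = "Safety/Release/city=SanFrancisco"

account_url = f"https://{account}.blob.core.windows.net/"
blob_prefix = f"{account_url}{container}/{folder}"
print(blob_prefix)
# https://azureopendatastorage.blob.core.windows.net/citydatacontainer/Safety/Release/city=SanFrancisco
```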
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient
if azure_storage_account_name is None or azure_storage_sas_token is None:
raise Exception(
"Provide your specific name and key for your Azure Storage account--see the Prerequisites section earlier.")
print('Looking for the first parquet under the folder ' +
folder_name + ' in container "' + container_name + '"...')
account_url = f"https://{azure_storage_account_name}.blob.core.windows.net/"
blob_service_client = BlobServiceClient(
    account_url, credential=azure_storage_sas_token if azure_storage_sas_token else None)
container_client = blob_service_client.get_container_client(container_name)
blobs = container_client.list_blobs(name_starts_with=folder_name)
sorted_blobs = sorted(list(blobs), key=lambda e: e.name, reverse=True)
targetBlobName = ''
for blob in sorted_blobs:
if blob.name.startswith(folder_name) and blob.name.endswith('.parquet'):
targetBlobName = blob.name
break
print('Target blob to download: ' + targetBlobName)
_, filename = os.path.split(targetBlobName)
blob_client = container_client.get_blob_client(targetBlobName)
with open(filename, 'wb') as local_file:
    blob_client.download_blob().readinto(local_file)
# Read the parquet file into Pandas data frame
import pandas as pd
print('Reading the parquet file into Pandas data frame')
df = pd.read_parquet(filename)
# You can add your own filters below
print('Loaded as a Pandas data frame: ')
df
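The comment above leaves the filtering open-ended. One possible filter, sketched with pandas on a stand-in frame shaped like the schema (the column names come from the schema table; the row values are invented):

```python
import pandas as pd

# Stand-in rows shaped like the dataset's schema; values are illustrative.
df = pd.DataFrame({
    "dataSubtype": ["911_Fire", "311_All", "911_Fire"],
    "category": ["Alarm", "Street and Sidewalk Cleaning", "Medical Incident"],
    "status": [None, "Closed", None],
})

# Keep only the fire calls, dropping the 311 cases.
fire_calls = df[df["dataSubtype"] == "911_Fire"]
print(len(fire_calls))  # 2
```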
Azure Databricks
# This is a package in preview.
# You need to pip install azureml-opendatasets on your Databricks cluster. https://docs.microsoft.com/en-us/azure/data-explorer/connect-from-databricks#install-the-python-library-on-your-azure-databricks-cluster
from azureml.opendatasets import SanFranciscoSafety
from datetime import datetime
from dateutil import parser
end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = SanFranciscoSafety(start_date=start_date, end_date=end_date)
safety = safety.to_spark_dataframe()
display(safety.limit(5))
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "citydatacontainer"
blob_relative_path = "Safety/Release/city=SanFrancisco"
blob_sas_token = r""
# Allow Spark to read from the blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
blob_sas_token)
print('Remote blob path: ' + wasbs_path)
# Read the parquet file with Spark; evaluation is lazy, so no data is loaded yet
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))
Azure Synapse
# This is a package in preview.
from azureml.opendatasets import SanFranciscoSafety
from datetime import datetime
from dateutil import parser
end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = SanFranciscoSafety(start_date=start_date, end_date=end_date)
safety = safety.to_spark_dataframe()
# Display top 5 rows
display(safety.limit(5))
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "citydatacontainer"
blob_relative_path = "Safety/Release/city=SanFrancisco"
blob_sas_token = r""
# Allow Spark to read from the blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
blob_sas_token)
print('Remote blob path: ' + wasbs_path)
# Read the parquet file with Spark; evaluation is lazy, so no data is loaded yet
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))

City Safety
From the Urban Innovation Initiative at Microsoft Research: a Databricks notebook for analytics with safety data (311 and 911 call data) from major U.S. cities. The analyses show frequency distributions and geographic clustering of safety issues within cities.