Project Assignment 3

Pratheeksha Nayak

Identifying hotspots of vulnerability during hurricane storm-surge inundation (Python implementation)

Motivation

  • Storm surges during hurricanes can cause significant flooding in coastal and inland regions resulting in damage to property and loss of life.
  • Identifying regions that are most vulnerable to such damage is essential for improving disaster response and building community resilience.
  • Timely evacuation during such disasters are crucial for saving lives and understanding the impact on infrastructure can help better planning and preparedness.

Approach

In this project, a comprehensive two-part analysis is conducted to understand the impact of storm-surge inundation:

  1. For Planners / Government Agencies: The first part focuses on providing spatial information for planners and government agencies. It utilizes choropleth maps to visually depict the distribution of populations affected by storm surges across various census tracts of Washington, DC. The data mapped includes key demographic, economic and housing characteristics such as total population, median age, population segments over 65 and under 10 years of age, total housing units, housing units built before 1950, median income, and the percentage of the population living below the poverty level. These visualizations are designed to help decision-makers prioritize resources and tailor their response strategies for different categories of storms, thus enabling a more effective, data-driven approach to disaster preparedness and mitigation.

  2. For Citizens: The second part adopts a citizen-centric approach, aiming to empower individuals with actionable information in the event of a storm. By allowing users to input a specific point location, the analysis identifies the category of storm that would affect that area. It also shows the pharmacies within a one-mile radius, highlighting those that would be inundated by storm surge waters. This information can be vital for individuals needing access to essential services like pharmacies during emergencies.


Part 1: For Planners / Government Agencies

Data Sources

  1. Storm surge risk areas(category-wise) in DC:
  • Source: Open Data DC portal ((https://opendata.dc.gov/)).
  • This data contains shapefiles that give areas in DC that are at risk of inundation for different categories of storms.
  • It uses the SLOSH (Sea, Lake and Overland Surge from Hurricanes) model by NOAA. It simulates storm-surge heights and extents by utilizing tens of thousands of hypothetical storms across 27 basins along the east and gulf coasts of the United States. Parameters such as storm intensity, direction, landfall location, wind etc. are varied for different categories of storms to generate the results.
  1. ACS 5-year data for DC census tracts:
  • Source: U.S.Census Bureau, American Community Survey (ACS) data in Open Data DC portal.
  • This data contains shapefiles of census data that include information on population, housing etc. within each census tract in DC.
  • 3 categories of data were used:
    • Demographic: Total population, Median age in years, Population over 65 years, Population under 10 years
    • Economic: Median income in dollars, Percentage below poverty level
    • Housing: Total housing units, Housing units built before 1950
  • Integrating the output of SLOSH model with socio-economic data can provide valuable insights into the affected regions. They can help identify populations at highest risk and help in guiding evacuation efforts and prioritizing resource allocation.
In [1]:
#Import required libraries

import matplotlib.pyplot as plt
import pandas as pd
from geopandas import read_file, gpd
import folium
import ipywidgets as widgets
from IPython.display import display

Storm surge Risk Areas data

In [2]:
input_file = './Data/StormSurgeRiskAreas/Storm_Surge_Risk_Areas.shp'
surge = gpd.read_file(input_file)
# surge.tail()
In [3]:
surge['HES_ZONE'].replace(99, 5, inplace=True)
print(surge.crs)
surge.tail()
epsg:4326
Out[3]:
HES_ZONE CONTOURLEN GIS_ID TARGET_FID JOIN_COUNT GLOBALID CREATOR CREATED EDITOR EDITED OBJECTID SHAPEAREA SHAPELEN geometry
14 4 1.038773e+04 WaterSurgePly_15 2512 1 {BAE713E3-84EE-4146-9221-7024967C1AF5} None None None None 15 0 0 POLYGON Z ((-77.03744 38.87997 0.00000, -77.03...
15 2 8.843568e+03 WaterSurgePly_16 2440 1 {30D8719E-77A6-4E2F-BF43-3B56E898B408} None None None None 16 0 0 POLYGON Z ((-76.99782 38.86592 0.00000, -76.99...
16 2 2.722179e+07 WaterSurgePly_17 5273 1 {98DE8302-25A5-40E5-8C22-C294E8CA1FBB} None None None None 17 0 0 MULTIPOLYGON Z (((-77.02075 38.81607 0.00000, ...
17 5 7.537968e+03 WaterSurgePly_18 2311 1 {D63F9A96-01FF-4640-ACA2-89AEE84740F4} None None None None 18 0 0 POLYGON Z ((-77.02122 38.84051 0.00000, -77.02...
18 1 9.075484e+03 WaterSurgePly_19 2222 1 {2B16A555-F720-40F0-B25C-C3EE86A5BF9D} None None None None 19 0 0 POLYGON Z ((-77.02183 38.81387 0.00000, -77.02...
In [ ]:
center_lat, center_lon = 38.905996,-77.043686
m1 = folium.Map(location=[center_lat, center_lon], tiles="CartoDB positron", zoom_start=12)
for i in range(1,6):
    m1 = surge[surge['HES_ZONE']==i].explore(m=m1, name=f"category {i}")
folium.LayerControl().add_to(m1)
m1

ACS data for DC census tracts

In [ ]:
#Demographic
input_file_1 = './Data/ACS_Demographic/ACS_5-Year_Demographic_Characteristics_DC_Census_Tract.shp'
acs_demographic = gpd.read_file(input_file_1)

#Economic
input_file_2 = './Data/ACS_Economic/ACS_5-Year_Economic_Characteristics_DC_Census_Tract.shp'
acs_economic = gpd.read_file(input_file_2)

#Housing
input_file_3 = './Data/ACS_Housing/ACS_5-Year_Housing_Characteristics_DC_Census_Tract.shp'
acs_housing = gpd.read_file(input_file_3)
In [ ]:
# Check projections
print('Demographic:', acs_demographic.crs)
print('Economic:', acs_economic.crs)
print('Housing:', acs_housing.crs)
In [ ]:
# Inspect the demographic data
acs_demographic.head()

Data Cleaning

  • For the 3 ACS datasets:
    • Identify attributes of interest and filter the data.
    • Rename attributes to have more understandable names.
    • Feature engineering to create new columns that are relevant to the study.
    • Drop redundant columns.
  • Merge the 3 ACS datasets to get one final dataset based on GEOID.
  • Drop columns will null values.
In [ ]:
demographic = acs_demographic[['GEOID', 'geometry', 'DP05_0001E', 'DP05_0018E', 'DP05_0005E', 'DP05_0006E', 'DP05_0024E']]
demographic = demographic.rename(columns ={'DP05_0001E':'total_pop', 'DP05_0018E':'median_age', 'DP05_0005E':'pop_under_5', 'DP05_0006E':'pop_5_9', 'DP05_0024E':'pop_over_65'})
demographic['pop_under_10'] = demographic['pop_under_5'] + demographic['pop_5_9']
demographic.drop(columns=['pop_under_5', 'pop_5_9'], inplace=True)
demographic
In [ ]:
# acs_economic.head()
In [ ]:
economic = acs_economic[['GEOID', 'geometry', 'DP03_0062E', 'DP03_0119P']]
economic = economic.rename(columns ={'DP03_0062E':'median_income', 'DP03_0119P':'percent_below_poverty', 'DP05_0005E':'pop_under_5', 'DP05_0006E':'pop_5_9', 'DP05_0024E':'pop_over_65'})
economic
In [ ]:
# acs_housing.head()
In [ ]:
housing = acs_housing[['GEOID', 'geometry', 'DP04_0001E', 'DP04_0025E', 'DP04_0026E']]
housing = housing.rename(columns ={'DP04_0001E':'total_units', 'DP04_0025E':'built_1949_1940', 'DP04_0026E':'built_before_1940'})
housing['built_before_1950'] = housing['built_1949_1940'] + housing['built_before_1940']
housing.drop(columns=['built_1949_1940', 'built_before_1940'], inplace=True)
housing
In [ ]:
# Combine the ACS datasets
merged_data = demographic.merge(economic, on=['GEOID', 'geometry'])
acs_data = merged_data.merge(housing, on=['GEOID', 'geometry'])
acs_data.dropna(inplace=True)
acs_data.head()
In [ ]:
# Visualize the total population in each DC census tract
ax = acs_data.plot(figsize=(15, 10), column='total_pop', legend='true')

Data Analysis

  • Surge Areas: The shapefile gives areas at the highest risk of inundation for a given category and higher storms. The assumption is made that if an area is inundated in a given category, then it is also inundated in higher category storms. So inundated areas for each category are extracted by combining areas at risk for the given category and lower category storms. For instance, areas at risk for a category 3 storm would be obtained by combining areas for category 1-3 storms.
  • ACS data: Clipped to inundation area for each category of storm.
In [ ]:
def get_category_surge(category):
    surge_category = surge[surge['HES_ZONE']<=category]
    return surge_category

def get_category_acs(category):
    acs_data_category = gpd.clip(acs_data, get_category_surge(category))
    return acs_data_category

Further, we create visualizations to help understand the spatial distribution of various attributes of interest. These attributes indicate the impact of the storm of people and infrastructure. Visualization are enabled using choropleth maps for each storm category to help planners understand the resources required for evacuation and preparedness.

In [ ]:
# Create two dropdowns for storm category and attribute to be visulaized

storm_category = widgets.Dropdown(
    options=['1', '2', '3', '4', '5'],
    value='5',
    description='Category of storm',
    disabled=False,
)

quantity = widgets.Dropdown(
    options=[('Total population', 'total_pop'), ('Median age', 'median_age'), 
             ('Population over 65', 'pop_over_65'), ('Population under 10', 'pop_under_10'), 
             ('Median income', 'median_income'), ('Percentage below poverty', 'percent_below_poverty'),
             ('Total housing units', 'total_units'), ('Units built before 1950', 'built_before_1950')],
    value='total_pop',
    description='Quantity to plot',
)
In [ ]:
# Function to get the maps and display them

def get_category_map(button):
    with output:
        output.clear_output()      # Clear previous output
        surge_category = get_category_surge(int(storm_category.value))
        acs_data_category = get_category_acs(int(storm_category.value))

        m = surge_category.explore(name="storm surge areas", tiles="CartoDB positron")
        cols = acs_data_category.columns[2:]

        acs_data_category.explore(m=m, name=quantity.value, column=quantity.value, 
                                  cmap="Reds", scheme="naturalbreaks", k=5,
                                  style_kwds=dict(color="black", weight=1, opacity=0.5, fillOpacity=1.0),
                                  legend=True, legend_kwds=dict(colorbar=True),
                                  popup=list(cols), tooltip=False,
                                  )

        folium.TileLayer("openstreetmap", control=True, opacity=0.5, name="openstreetmap").add_to(m)
        folium.LayerControl().add_to(m)
        display(m)
In [ ]:
# Arrange the dropdowns in the same line using HBox
dropdown_box = widgets.HBox([storm_category, quantity])

# Display the dropdowns
display(dropdown_box)

output = widgets.Output()

# Create the button for generating maps
button = widgets.Button(description='Get Plot')

# Set up the button click event
button.on_click(get_category_map)

# Display the button and output area
display(button)
display(output)

Results

In [ ]:
# Plot the total population and housing units impacted by each category of storm

affected_cols = ['total_pop', 'pop_over_65', 'pop_under_10', 'total_units', 'built_before_1950']

affected={}
for category in range(1,6):
    acs_category = get_category_acs(category)
    affected[category] = {}
    
    for col in affected_cols:
        affected[category][col] = sum(acs_category[col])

affected_df = pd.DataFrame(affected).T
axes = affected_df.plot.bar(rot=0, subplots=True, figsize=(10,20))
In [ ]:
affected_df = pd.DataFrame(affected)
axes = affected_df.plot.bar(rot=0, figsize=(12,10))
plt.ylabel('Number affected', size=24);
plt.legend(title="Category of storm", fontsize=20, fancybox=True)
axes.tick_params(axis='x', labelsize=20)  # Increase x-axis tick font size
axes.tick_params(axis='y', labelsize=20)  # Increase y-axis tick font size

-

Key observations

  • As the category of storm increases, the impact is higher. There is a significantly higher rise in numbers between category 2 to category 3 storms.

  • Certain areas may require additional resources and time for evacuation:

    • Areas with high population (however, population density could be more indicative of bottlenecks). Specifically, areas where elderly people (over 65 years) and children (under 10 years) reside in larger numbers.
    • Areas with large numbers of older housing units (built before 1950).
    • Regions with lower median income or higher poverty levels (since recoverability from damage could be lower).
In [ ]: