Project Assignment 3¶

Introduction¶

The overall goal of my project is to analyze the Social Vulnerability Index (SVI) for counties in the U.S. using the 2022 SVI dataset. The Social Vulnerability Index measures how vulnerable each community is to disasters, economic stress, and public health problems. By analyzing this dataset, I can find patterns in vulnerability across counties and identify locations that might need more resources and support. Understanding these patterns can help policymakers and emergency planners improve disaster readiness and community resilience.

Dataset Description¶

The dataset that I used for this project is the 2022 Social Vulnerability Index dataset at the county level. This dataset was made by the CDC and has vulnerability scores for counties all across the U.S. Each county is given a score from 0 to 1, with higher values meaning greater vulnerability. The main variable I will analyze is RPL_THEMES, which represents the overall social vulnerability score.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

svi = pd.read_csv("SVI_2022_US_county.csv")
svi.head()
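
One thing to watch when loading county data is that FIPS codes are five-character identifiers, and reading them as integers drops the leading zero (Autauga County shows up as 1001 instead of 01001). The sketch below is only an illustration on a made-up two-row excerpt, not the real file; the `sample_csv` and `sample` names are hypothetical. It reads the column as text and pads it back to five characters, which should work whether or not the raw file keeps the leading zeros.

```python
import io
import pandas as pd

# Hypothetical two-row excerpt standing in for SVI_2022_US_county.csv.
sample_csv = io.StringIO(
    "STATE,COUNTY,FIPS,RPL_THEMES\n"
    "Alabama,Autauga County,1001,0.2663\n"
    "Alabama,Baldwin County,1003,0.3487\n"
)

# Read FIPS as text so pandas does not convert it to an integer.
sample = pd.read_csv(sample_csv, dtype={"FIPS": str})

# Pad to five characters so state code 1 becomes "01".
sample["FIPS"] = sample["FIPS"].str.zfill(5)

print(sample[["COUNTY", "FIPS"]])
```

Keeping FIPS as a padded string matters mostly if the table is later joined to county boundary files or other datasets keyed on FIPS.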

Inspecting the Dataset¶

svi.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3144 entries, 0 to 3143
Columns: 158 entries, ST to MP_OTHERRACE
dtypes: float64(75), int64(79), object(4)
memory usage: 3.8+ MB

svi.describe()

Checking for NA values and exploring important variables

svi.isnull().sum()

ST              0
STATE           0
ST_ABBR         0
STCNTY          0
COUNTY          0
               ..
MP_NHPI         0
EP_TWOMORE      0
MP_TWOMORE      0
EP_OTHERRACE    0
MP_OTHERRACE    0
Length: 158, dtype: int64
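
A zero count from `isnull()` only means pandas found no empty cells. To my understanding, the CDC SVI files code unavailable values as -999 instead of leaving them blank, so those would not show up in the counts above. The sketch below uses a hypothetical toy frame (the county names and the `toy` and `cleaned` variables are made up) to show one way to count that sentinel and convert it to NaN before computing statistics.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the -999 row imitates a county with no computed score.
toy = pd.DataFrame({
    "COUNTY": ["A County", "B County", "C County"],
    "RPL_THEMES": [0.2663, -999.0, 0.9927],
})

# Count how many scores carry the assumed -999 "no data" code.
n_sentinel = int((toy["RPL_THEMES"] == -999).sum())

# Replace the sentinel with NaN so means and plots ignore it.
cleaned = toy.replace(-999, np.nan)
n_missing = int(cleaned["RPL_THEMES"].isnull().sum())

print(n_sentinel, n_missing)
```

On the real data, the same check would be `(svi['RPL_THEMES'] == -999).sum()`. If it returns zero, the histogram and averages are unaffected.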

svi[['STATE','COUNTY','RPL_THEMES']].head()

Distribution of Social Vulnerability Scores¶

plt.figure(figsize=(10,5))
plt.hist(svi['RPL_THEMES'], bins=10, edgecolor='black')
plt.title("Distribution of Social Vulnerability Index Scores")
plt.xlabel("SVI Score")
plt.ylabel("Number of Counties")

plt.show()

Because the SVI variable is a percentile ranking, the values are distributed almost evenly from 0 to 1. When data follow a uniform distribution, each bin contains roughly the same number of observations.
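
To illustrate why a percentile ranking flattens the histogram, here is a small sketch on synthetic, deliberately skewed values (not SVI data). The formula (rank - 1) / (N - 1) is my understanding of how the CDC describes its percentile ranks, so treat it as an assumption; the `raw` and `pct_rank` names are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic right-skewed values, one per county (3144 matches the row count above).
rng = np.random.default_rng(0)
raw = pd.Series(rng.lognormal(mean=0.0, sigma=1.0, size=3144))

# Percentile rank: lowest value maps to 0, highest value maps to 1.
pct_rank = (raw.rank(method="min") - 1) / (len(raw) - 1)

# Count observations in the same 10 bins used for the histogram.
counts, _ = np.histogram(pct_rank, bins=10, range=(0, 1))
print(counts)
```

Even though the raw values are heavily skewed, the ranked values land almost equally in every bin, which is the same flat shape seen in the SVI histogram.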

Identifying Counties with the Highest Vulnerability and Visualization¶

This sorts the dataset so I can find the counties with the highest vulnerability scores.

top_counties = svi.sort_values(by='RPL_THEMES', ascending=False)
top_counties[['STATE','COUNTY','RPL_THEMES']].head(10)

Since the top ten counties have very similar scores close to 1, the x-axis is zoomed in to show the differences between counties. The chart shows that Madison Parish in Louisiana has the highest vulnerability score, and the remaining counties follow in descending order.

plt.figure(figsize=(10,6))
plt.barh(top_counties.head(10)['COUNTY'], top_counties.head(10)['RPL_THEMES'], color = 'steelblue')
plt.title("Top 10 Counties with Highest Social Vulnerability")
plt.xlabel("SVI Score")
plt.xlim(0.995, 1.001)
plt.gca().invert_yaxis()

plt.show()
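
County names repeat across states (there are many Washington Counties, for example), so a chart labeled only by COUNTY can be ambiguous. The sketch below shows one possible way to build "County, ST" labels from the COUNTY and ST_ABBR columns. The rows are hypothetical values shaped like the SVI columns, and the `LABEL` column is a name I made up for illustration.

```python
import pandas as pd

# Hypothetical rows shaped like the SVI columns used above.
toy = pd.DataFrame({
    "ST_ABBR": ["MS", "AR", "LA"],
    "COUNTY": ["Washington County", "Washington County", "Madison Parish"],
    "RPL_THEMES": [0.9971, 0.5000, 1.0000],
})

# nlargest keeps only the top rows, already sorted from highest to lowest.
top = toy.nlargest(2, "RPL_THEMES").copy()

# Combine county and state abbreviation into one unambiguous label.
top["LABEL"] = top["COUNTY"] + ", " + top["ST_ABBR"]

print(top["LABEL"].tolist())
```

The `LABEL` column could then be passed to `plt.barh` in place of `COUNTY`.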

State-Level Vulnerability Patterns and Visualization¶

This groups the counties by state and finds the average vulnerability score for each state.

state_avg = svi.groupby('STATE')['RPL_THEMES'].mean().sort_values(ascending=False)
state_avg.head(10)

STATE
Arizona           0.829213
Mississippi       0.806721
New Mexico        0.794094
Louisiana         0.772586
South Carolina    0.763543
Florida           0.708306
Texas             0.706484
Georgia           0.706448
California        0.690778
Arkansas          0.690255
Name: RPL_THEMES, dtype: float64
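
A plain mean treats every county equally, so a state with only a few counties, or with most of its people in one county, can look different depending on how the average is taken. The sketch below, on hypothetical toy data, reports the county count next to the mean and adds a population-weighted mean using E_TOTPOP. The state names "X" and "Y" and the output column names are made up, and averaging percentile ranks either way is only a rough summary.

```python
import pandas as pd

# Hypothetical toy data with the same column names as the SVI file.
toy = pd.DataFrame({
    "STATE": ["X", "X", "Y", "Y", "Y"],
    "E_TOTPOP": [1000, 9000, 5000, 5000, 10000],
    "RPL_THEMES": [0.9, 0.1, 0.5, 0.6, 0.7],
})

# Unweighted mean and number of counties per state.
summary = toy.groupby("STATE")["RPL_THEMES"].agg(mean_svi="mean", n_counties="count")

# Population-weighted mean: sum(score * population) / sum(population).
weighted_sum = (toy["RPL_THEMES"] * toy["E_TOTPOP"]).groupby(toy["STATE"]).sum()
total_pop = toy.groupby("STATE")["E_TOTPOP"].sum()
summary["pop_weighted_svi"] = weighted_sum / total_pop

print(summary)
```

In the toy data, state X has an unweighted mean of 0.5 but a weighted mean of 0.18, because most of its population lives in the low-score county.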

The visualization shows which states have the highest average vulnerability across their counties.

plt.figure(figsize=(10,6))
state_avg.head(10).plot(kind='bar')
plt.title("States with Highest Average Social Vulnerability")
plt.ylabel("Average SVI Score")

plt.show()

Results¶

The first graph that I made shows the distribution of SVI scores for counties in the U.S. These values range from 0 to 1, and the bars are almost the same height across the full graph. This is because the SVI score is based on percentile rankings, which spread the counties evenly across the range of values.

The second graph that I made shows the ten counties with the highest social vulnerability scores. I found that Madison Parish in Louisiana has the highest score, followed by Luna County in New Mexico and Dimmit County in Texas. All of the top ten counties have scores very close to 1, with the lowest one being higher than 0.997, which means they all rank among the most vulnerable counties in the country. These counties might be facing similar challenges, such as poverty, limited healthcare, and weak infrastructure.

The last graph that I made shows the states with the highest average social vulnerability scores. Arizona has the highest average score, followed by Mississippi and New Mexico. Other states in the top ten include Louisiana, South Carolina, Florida, and Texas. This suggests that vulnerability could be more common in certain areas of the country.

Overall, the visualizations and analysis show that while SVI scores are evenly spread across counties nationally, some states, such as Louisiana, Mississippi, New Mexico, Texas, and Arizona, appear in both the county-level and state-level top ten lists.

Conclusion¶

In conclusion, for this project I analyzed the Social Vulnerability Index (SVI) for counties in the U.S. using the 2022 dataset. The graphs I made helped show how vulnerability scores are distributed across counties and which counties have the highest scores. Counties like Madison Parish in Louisiana, Luna County in New Mexico, and Dimmit County in Texas had the highest vulnerability scores.

The state-level results showed that certain states, like Arizona, Mississippi, and New Mexico, have the highest average vulnerability scores. This suggests that vulnerability could be more common in certain regions of the country. In geography, places near each other tend to have similar characteristics. Because of this, counties that are close to one another may show similar vulnerability patterns. Looking at these spatial patterns could help find regions where communities may need more support and resources.

	ST	STATE	ST_ABBR	STCNTY	COUNTY	FIPS	LOCATION	AREA_SQMI	E_TOTPOP	...	EP_ASIAN	MP_ASIAN	EP_AIAN	MP_AIAN	EP_NHPI	MP_NHPI	EP_TWOMORE	MP_TWOMORE	EP_OTHERRACE	MP_OTHERRACE
0	1	Alabama	AL	1001	Autauga County	1001	Autauga County, Alabama	594.454786	58761	...	1.1	0.4	0.1	0.1	0.0	0.1	3.3	1.0	0.2	0.3
1	1	Alabama	AL	1003	Baldwin County	1003	Baldwin County, Alabama	1589.861817	233420	...	0.9	0.1	0.2	0.1	0.0	0.1	3.1	0.4	0.4	0.3
2	1	Alabama	AL	1005	Barbour County	1005	Barbour County, Alabama	885.007619	24877	...	0.5	0.1	0.3	0.1	0.0	0.1	1.8	0.7	1.2	0.8
3	1	Alabama	AL	1007	Bibb County	1007	Bibb County, Alabama	622.469286	22251	...	0.3	0.4	0.1	0.1	0.0	0.2	1.7	1.0	0.1	0.1
4	1	Alabama	AL	1009	Blount County	1009	Blount County, Alabama	644.890376	59077	...	0.2	0.2	0.1	0.1	0.2	0.2	2.8	0.7	0.1	0.1

	ST	STCNTY	FIPS	AREA_SQMI	E_TOTPOP	M_TOTPOP	E_HU	M_HU	E_HH	M_HH	...	EP_ASIAN	MP_ASIAN	EP_AIAN	MP_AIAN	EP_NHPI	MP_NHPI	EP_TWOMORE	MP_TWOMORE	EP_OTHERRACE	MP_OTHERRACE
count	3144.000000	3144.000000	3144.000000	3144.000000	3.144000e+03	3144.000000	3.144000e+03	3144.000000	3.144000e+03	3144.000000	...	3144.000000	3144.000000	3144.000000	3144.000000	3144.000000	3144.000000	3144.000000	3144.000000	3144.000000	3144.000000
mean	30.264313	30368.187023	30368.187023	1123.812856	1.053109e+05	5.687023	4.482939e+04	66.367366	3.999248e+04	378.694975	...	1.409606	0.387182	1.709192	0.413518	0.091985	0.356934	3.105821	0.874491	0.264695	0.423282
std	15.152668	15170.427484	15170.427484	3592.773606	3.337924e+05	29.924606	1.320087e+05	64.258691	1.214871e+05	367.587221	...	2.889120	0.766913	7.489594	1.119786	0.472447	1.142309	1.968423	1.203668	0.384044	1.354489
min	1.000000	1001.000000	1001.000000	2.046442	5.000000e+01	0.000000	5.500000e+01	11.000000	3.200000e+01	20.000000	...	0.000000	0.100000	0.000000	0.100000	0.000000	0.100000	0.000000	0.100000	0.000000	0.100000
25%	18.000000	18174.500000	18174.500000	430.968649	1.083575e+04	0.000000	5.238250e+03	30.000000	4.139750e+03	173.000000	...	0.300000	0.100000	0.100000	0.100000	0.000000	0.100000	2.000000	0.400000	0.000000	0.100000
50%	29.000000	29174.000000	29174.000000	615.625786	2.578450e+04	0.000000	1.231600e+04	47.000000	9.944000e+03	276.000000	...	0.600000	0.200000	0.200000	0.100000	0.000000	0.100000	2.800000	0.600000	0.200000	0.200000
75%	45.000000	45079.500000	45079.500000	924.245980	6.807975e+04	0.000000	3.153475e+04	77.000000	2.643475e+04	449.000000	...	1.300000	0.400000	0.500000	0.400000	0.100000	0.300000	3.700000	1.000000	0.400000	0.400000
max	56.000000	56045.000000	56045.000000	145575.491748	9.936690e+06	328.000000	3.599561e+06	931.000000	3.363093e+06	4811.000000	...	41.600000	20.500000	90.900000	37.500000	12.000000	37.500000	23.400000	37.500000	9.200000	41.900000

	STATE	COUNTY	RPL_THEMES
0	Alabama	Autauga County	0.2663
1	Alabama	Baldwin County	0.3487
2	Alabama	Barbour County	0.9927
3	Alabama	Bibb County	0.8451
4	Alabama	Blount County	0.6166

	STATE	COUNTY	RPL_THEMES
1147	Louisiana	Madison Parish	1.0000
1813	New Mexico	Luna County	0.9997
2588	Texas	Dimmit County	0.9994
1134	Louisiana	Evangeline Parish	0.9990
120	Arkansas	Chicot County	0.9987
1945	North Carolina	Lenoir County	0.9984
334	Florida	DeSoto County	0.9981
111	Arizona	Yuma County	0.9978
1484	Mississippi	Yazoo County	0.9975
1478	Mississippi	Washington County	0.9971