Spatial and temporal analysis of greenhouse gas emissions transfers over the US electric grid

Guangyu Cai, Joe Franke, Amina Jackson, Jenni Nugent

Research Questions

  1. How is electricity transferred on the U.S. electric grid between balancing authorities on a sub-annual timescale?
  2. What is the spatial distribution of emissions and how does it compare to median household income?

Project Motivation

  • Spatial and temporal transfer of resources has important policy and environmental justice implications.

Data Sources

Workflow Overview

  • Import Packages
  • U.S. Energy Information Administration Data Preparation
  • Greenhouse Gas Emissions Estimate
  • Data Preparation
    • Data Preparation Visualization 1: Balancing Authority Data
    • Data Preparation Visualization 2: Emissions and Location Dataframe
    • Data Preparation Visualization 3: Emissions and Fuel Type Dataframe
    • Data Preparation Visualization 3: Census Data Preparation
  • Visualizations
    • Visualization 1: Map of BA-to-BA electric transfers
    • Visualization 2: Map of Point Source GHG Emissions
    • Visualization 3: Map of Power Plants & Fuel Type and Median Household Income
    • Visualization 4: Spatial visualization of BA-to-BA interchange

Import Packages

In [ ]:
#the pakages needed using conda

#conda install -c conda-forge geoplot 
#conda install pysal
#conda install -c conda-forge pyvis

#or

#pip install geoplot
#pip install pysal
#pip install pyvis
In [ ]:
#Note:
#unzip control_areas.geojson in BA Geometry folder
In [1]:
import pandas as pd
import math
import geopandas as gpd
import matplotlib.pyplot as plt
%matplotlib inline
#pakages needed to add
import networkx as nx
import geoplot as gplt
from shapely.geometry import Point
from geopandas import GeoDataFrame
import matplotlib.patheffects as pe
from matplotlib.colors import LinearSegmentedColormap
import mapclassify
import plotly.express as px
import contextily as cx
import plotly.graph_objects as go

U.S. Energy Information Administration Data Preparation

  • Import power plant location and balancing authority data from EIA-860
  • Import plant-level fuel type, fuel consumption, and electricity generation data from EIA-923
    • Transform monthly data from wide format to long format
In [2]:
# Import plant location data from Form EIA-860
plantData_2018 = pd.read_csv('./EIA Data/2___Plant_Y2018.csv',header=1)
keep_columns1 = (['Plant Code','Latitude',
                  'Longitude','Balancing Authority Code'])
plantData_2018 = plantData_2018[keep_columns1] 
plantData_2018.head()
Out[2]:
Plant Code Latitude Longitude Balancing Authority Code
0 2 33.458665 -87.356823 SOCO
1 3 31.006900 -88.010300 SOCO
2 4 32.583889 -86.283056 SOCO
3 7 34.012800 -85.970800 SOCO
4 8 33.644344 -87.196486 SOCO
In [3]:
# Import monthly, plant-level fuel cosnumption and generation data from Form EIA-923
fuelAndGeneration_2018 = pd.read_csv('./EIA Data/EIA923_Schedules_2_3_4_5_M_12_2018_Final_Revision.csv',header=5)
fuelAndGeneration_2018.rename(columns = {'Plant Id': 'Plant ID',
                                         'Combined Heat And\nPower Plant': 'Combined Heat & Power Plant',
                                         'Plant State': 'State',
                                         'Reported\nPrime Mover': 'Reported Prime Mover', 
                                         'Reported\nFuel Type Code':'Reported Fuel Type Code',
                                         'Elec_MMBtu\nJanuary': 'Elec_MMBtu_Jan', 
                                         'Elec_MMBtu\nFebruary': 'Elec_MMBtu_Feb', 
                                         'Elec_MMBtu\nMarch': 'Elec_MMBtu_Mar',
                                         'Elec_MMBtu\nApril': 'Elec_MMBtu_Apr', 
                                         'Elec_MMBtu\nMay': 'Elec_MMBtu_May', 
                                         'Elec_MMBtu\nJune': 'Elec_MMBtu_Jun',
                                         'Elec_MMBtu\nJuly': 'Elec_MMBtu_Jul', 
                                         'Elec_MMBtu\nAugust': 'Elec_MMBtu_Aug', 
                                         'Elec_MMBtu\nSeptember': 'Elec_MMBtu_Sep',
                                         'Elec_MMBtu\nOctober': 'Elec_MMBtu_Oct', 
                                         'Elec_MMBtu\nNovember': 'Elec_MMBtu_Nov', 
                                         'Elec_MMBtu\nDecember': 'Elec_MMBtu_Dec',
                                         'Netgen\nJanuary':'Netgen_Jan', 
                                         'Netgen\nFebruary':'Netgen_Feb', 
                                         'Netgen\nMarch':'Netgen_Mar',
                                         'Netgen\nApril':'Netgen_Apr',
                                         'Netgen\nMay':'Netgen_May', 
                                         'Netgen\nJune':'Netgen_Jun', 
                                         'Netgen\nJuly':'Netgen_Jul', 
                                         'Netgen\nAugust':'Netgen_Aug',
                                         'Netgen\nSeptember':'Netgen_Sep', 
                                         'Netgen\nOctober':'Netgen_Oct', 
                                         'Netgen\nNovember':'Netgen_Nov', 
                                         'Netgen\nDecember':'Netgen_Dec'}, inplace = True) 
In [4]:
# Create a function that converts strings to floats in EIA-923 Files
def string2Float(dfRow):
    if type(dfRow) == str:
        if dfRow == '.': #replace missing values with 0
            x = 0
        elif dfRow == 'nan': #replace missing values with 0
            x = 0
        else:
            x = float(dfRow.replace(',','')) #remove commas from values and convert to float type
    if type(dfRow) == float:
        if math.isnan(dfRow): #replace missing values with 0
            x = 0
        else:
            x = dfRow
    return x
In [5]:
# Convert String Numbers to Float Values
convert_columns = (['Elec_MMBtu_Jan', 'Elec_MMBtu_Feb', 'Elec_MMBtu_Mar', 'Elec_MMBtu_Apr', 
                    'Elec_MMBtu_May','Elec_MMBtu_Jun', 'Elec_MMBtu_Jul', 'Elec_MMBtu_Aug', 
                    'Elec_MMBtu_Sep', 'Elec_MMBtu_Oct','Elec_MMBtu_Nov', 'Elec_MMBtu_Dec',
                    'Netgen_Jan', 'Netgen_Feb', 'Netgen_Mar', 'Netgen_Apr', 
                    'Netgen_May','Netgen_Jun', 'Netgen_Jul', 'Netgen_Aug', 
                    'Netgen_Sep', 'Netgen_Oct','Netgen_Nov', 'Netgen_Dec'])
for col in convert_columns:
    fuelAndGeneration_2018[col] = fuelAndGeneration_2018.apply(lambda row: string2Float(row[col]),axis=1)
#fuelAndGeneration_2018
In [6]:
# Keep generation data and create dataframe of generation
keep_columns2 = (['Plant ID', 'Combined Heat & Power Plant', 'Plant Name',
                 'State', 'NERC Region', 'Reported Prime Mover', 'Reported Fuel Type Code',
                 'Netgen_Jan', 'Netgen_Feb', 'Netgen_Mar', 'Netgen_Apr', 
                 'Netgen_May','Netgen_Jun', 'Netgen_Jul', 'Netgen_Aug', 
                 'Netgen_Sep', 'Netgen_Oct','Netgen_Nov', 'Netgen_Dec'])
generation_2018 = fuelAndGeneration_2018[keep_columns2] 
In [7]:
# Keep fuel consumption data and create dataframe of fuel consumption
keep_columns3 = (['Plant ID', 'Combined Heat & Power Plant', 'Plant Name',
                 'State', 'NERC Region', 'Reported Prime Mover', 'Reported Fuel Type Code',
                 'Elec_MMBtu_Jan', 'Elec_MMBtu_Feb', 'Elec_MMBtu_Mar', 'Elec_MMBtu_Apr', 
                 'Elec_MMBtu_May','Elec_MMBtu_Jun', 'Elec_MMBtu_Jul', 'Elec_MMBtu_Aug', 
                 'Elec_MMBtu_Sep', 'Elec_MMBtu_Oct','Elec_MMBtu_Nov', 'Elec_MMBtu_Dec'])
fuel_2018 = fuelAndGeneration_2018[keep_columns3] 
In [8]:
# Melt generation data from wide to long format
genMonthsList = ['Netgen_Jan', 'Netgen_Feb', 'Netgen_Mar', 'Netgen_Apr', 
                 'Netgen_May','Netgen_Jun', 'Netgen_Jul', 'Netgen_Aug', 
                 'Netgen_Sep', 'Netgen_Oct','Netgen_Nov', 'Netgen_Dec']
monthNum1 = list(range(1, 13))
genMonths = dict(zip(genMonthsList,monthNum1))
netGenMelted = generation_2018.melt(id_vars=['Plant ID', 'Combined Heat & Power Plant', 'Plant Name', 
                                'State', 'NERC Region', 'Reported Prime Mover', 'Reported Fuel Type Code'], 
                                 value_vars=genMonthsList, 
                                 value_name='Net Generation (MWh)',
                                 var_name= 'Month')
generation_2018 = netGenMelted.assign(Month=netGenMelted['Month'].map(genMonths)) 
#generation_2018
In [9]:
#Melt fuel consumption from wide to long format
fuelMonthsList = ['Elec_MMBtu_Jan', 'Elec_MMBtu_Feb', 'Elec_MMBtu_Mar', 'Elec_MMBtu_Apr', 
                 'Elec_MMBtu_May','Elec_MMBtu_Jun', 'Elec_MMBtu_Jul', 'Elec_MMBtu_Aug', 
                 'Elec_MMBtu_Sep', 'Elec_MMBtu_Oct','Elec_MMBtu_Nov', 'Elec_MMBtu_Dec']
monthNum1 = list(range(1, 13))
fuelMonths = dict(zip(fuelMonthsList,monthNum1))
fuelMelted = fuel_2018.melt(id_vars=['Plant ID', 'Combined Heat & Power Plant', 'Plant Name', 
                                'State', 'NERC Region', 'Reported Prime Mover', 'Reported Fuel Type Code'], 
                                 value_vars=fuelMonthsList, 
                                 value_name='Electric Fuel Consumption (MMBtu)',
                                 var_name= 'Month')
fuel_2018 = fuelMelted.assign(Month=fuelMelted['Month'].map(fuelMonths)) 
#fuel_2018
In [10]:
# Merge long-format fuel data and long-format generation data
fuelAndGeneration_2018long =  fuel_2018.copy()
fuelAndGeneration_2018long['Net Generation (MWh)'] =  generation_2018['Net Generation (MWh)']
#fuelAndGeneration_2018long.to_csv('fuelAndGeneration_2018long1.csv')
fuelAndGeneration_2018long.head()
Out[10]:
Plant ID Combined Heat & Power Plant Plant Name State NERC Region Reported Prime Mover Reported Fuel Type Code Month Electric Fuel Consumption (MMBtu) Net Generation (MWh)
0 2 N Bankhead Dam AL SERC HY WAT 1 0.0 -36.0
1 3 N Barry AL SERC CA NG 1 56603.0 264566.0
2 3 N Barry AL SERC CT NG 1 5202919.0 493620.0
3 3 N Barry AL SERC CT SUB 1 0.0 0.0
4 3 N Barry AL SERC ST BIT 1 4770781.0 474858.0

Greenhouse Gas Emissions Estimate

  • Estimate GHG emissions based on EIA-923 fuel consumption data and emissions factors from the EPA
In [11]:
# Create Function for Estimating GHG emissions
def GHG_estimate(fuelType,fuelConsumption,netGeneration):
    CH4_GWP = 25
    N2O_GWP = 298
    G_TO_KG = 1/1000
    KG_TO_LBS = 1/2.204623  
    
    if fuelType == "NG": 
        """Natural Gas"""
        CO2eqv = 53.06 + 1.0*CH4_GWP*G_TO_KG + 0.10*N2O_GWP*G_TO_KG

    elif fuelType == "ANT": 
        """Anthracite Coal"""
        CO2eqv = 103.69 + 11*CH4_GWP*G_TO_KG + 1.6*N2O_GWP*G_TO_KG 
    elif fuelType == "BIT": 
        """Bituminous Coal"""
        CO2eqv = 93.28 + 11*CH4_GWP*G_TO_KG + 1.6*N2O_GWP*G_TO_KG 
    elif fuelType == "LIG": 
        """Lignite Coal"""
        CO2eqv = 97.72 + 11*CH4_GWP*G_TO_KG + 1.6*N2O_GWP*G_TO_KG 
    elif fuelType == "SUB": 
        """Sub-bituminous Coal"""
        CO2eqv = 97.17 + 11*CH4_GWP*G_TO_KG + 1.6*N2O_GWP*G_TO_KG 
    
    elif fuelType == "JF": 
        """Jet Fuel [Kerosene-Type Jet Fuel]"""
        CO2eqv = 72.22 + 3.0*CH4_GWP*G_TO_KG + 0.60*N2O_GWP*G_TO_KG
    elif fuelType == "PG": 
        """Propane Gas"""
        CO2eqv = 61.46 + 3.0*CH4_GWP*G_TO_KG + 0.60*N2O_GWP*G_TO_KG
    elif fuelType == "PC": 
        """Petroleum Coke - [Solid]"""
        CO2eqv = 102.41 + 32*CH4_GWP*G_TO_KG + 4.2*N2O_GWP*G_TO_KG     
    elif fuelType == "WDS": 
        """Wood/Wood Waste Solids [Wood and Wood Residuals]"""            
        CO2eqv = 93.80 + 7.2*CH4_GWP*G_TO_KG + 3.6*N2O_GWP*G_TO_KG 
    elif fuelType == "KER": 
        """Kerosene"""
        CO2eqv = 75.20 + 3.0*CH4_GWP*G_TO_KG + 0.60*N2O_GWP*G_TO_KG 
    elif fuelType == "DFO": 
        """Distillate Fuel Oil (including diesel, No. 1, No. 2, and No. 4 fuel oils) [No. 1, No. 2, No. 3]"""
        tempCO2 = (73.25 + 73.96 + 75.04)/3 #Average No.1, No.2, No.4
        CO2eqv = tempCO2 + 3.0*CH4_GWP*G_TO_KG + 0.60*N2O_GWP*G_TO_KG  
    elif fuelType == "LFG": 
        """Landfill Gas"""
        CO2eqv = 52.07 + 3.2*CH4_GWP*G_TO_KG + 0.63*N2O_GWP*G_TO_KG 
    elif fuelType == "AB": 
        """Agricultural By-Products"""
        CO2eqv = 118.17 + 32*CH4_GWP*G_TO_KG + 4.2*N2O_GWP*G_TO_KG 
    elif fuelType == "RFO": 
        """Residual Fuel Oil (incl. Nos. 5 & 6 fuel oils, and bunker C fuel oil) [No. 5, No.6]"""
        tempCO2 = (72.93 + 75.10)/2 #Average No.5, No.6
        CO2eqv = tempCO2 + 3.0*CH4_GWP*G_TO_KG + 0.60*N2O_GWP*G_TO_KG 
    elif fuelType == "TDF": 
        """Tire-Derived Fuels [Tires]"""
        CO2eqv = 85.97 + 32*CH4_GWP*G_TO_KG + 4.2*N2O_GWP*G_TO_KG  
    elif fuelType == "OBG": 
        """Other Biomass Gas"""
        CO2eqv = 52.07 + 3.2*CH4_GWP*G_TO_KG + 0.63*N2O_GWP*G_TO_KG 
    else:                
        CO2eqv = 0
    
    GHG = (CO2eqv * fuelConsumption) * KG_TO_LBS # ((kg/MMbtu) * MMBtu) 
    return round(GHG, 4)
In [12]:
# Estimate GHG Emissions
fuelAndGeneration_2018long['GHG Estimate (lbs)'] = fuelAndGeneration_2018long.apply(lambda row: GHG_estimate(row['Reported Fuel Type Code'],row['Electric Fuel Consumption (MMBtu)'],row['Net Generation (MWh)']),axis=1)
fuelAndGeneration_2018long = fuelAndGeneration_2018long[ fuelAndGeneration_2018long['GHG Estimate (lbs)'] > 0 ] 
#fuelAndGeneration_2018long.to_csv('fuelAndGeneration_2018long.csv')
#fuelAndGeneration_2018long
In [13]:
# Aggregate GHG Emissions by plant
fuelAndGeneration_2018_AggPlant = fuelAndGeneration_2018long.groupby(by = ['Month','Plant ID'],as_index=False).agg( {'GHG Estimate (lbs)': ['sum'],
                                                                                                                    'Net Generation (MWh)': ['sum']})

fuelAndGeneration_2018_AggPlant.columns = fuelAndGeneration_2018_AggPlant.columns.droplevel(1)
fuelAndGeneration_2018_AggPlant.head()
Out[13]:
Month Plant ID GHG Estimate (lbs) Net Generation (MWh)
0 1 3 3.385828e+08 1267682.0
1 1 7 4.145301e+06 14592.0
2 1 8 2.683336e+08 665047.0
3 1 9 1.432610e+06 3601.0
4 1 10 2.123426e+07 72463.0

Data Preparation Visualization 1: Balancing Authority Data

  • Match Balancing Authority Acronyms with Balancing authority shapefiles
  • Find the centroid of the balancing authorities
  • Import electricty transfer data between balancing authorities
In [14]:
# Import Balancing Authority Shape Data
balancing_Authorities = gpd.read_file('./BA Geometry/Control_Areas.geojson')
balancing_Authorities = balancing_Authorities.set_index('OBJECTID')
balancing_Authorities = balancing_Authorities[(balancing_Authorities['STATE'] != 'AK') & (balancing_Authorities['STATE'] != 'HI' )]
#print(balancing_Authorities.crs)
In [15]:
# Import Balacing Authority Acronym List
BA_Acronyms = pd.read_csv('./BA Geometry/Balancing Authorities EIA930 HIFDL 2018.csv')
BA_Acronyms

# Merge Balancing Authority Shape Data with Acronyms
BA_Acronyms = balancing_Authorities.merge(BA_Acronyms, left_on='NAME', right_on='HIFDL Name 2022', how='right')
BA_Acronyms = BA_Acronyms[['BA Code', 'NAME', 'ADDRESS', 'CITY', 'STATE', 'ZIP', 'geometry', 'Lat', 'Long']]

# Projection
BA_Acronyms_3857 = BA_Acronyms.to_crs(epsg=3857)

# Find centroids
BA_Acronyms_3857['centroid'] = BA_Acronyms_3857.centroid
In [16]:
BA_Acronyms_3857['polygon'] = BA_Acronyms_3857['geometry']

BA_Acronyms_3857['geometry'] = BA_Acronyms_3857['centroid']
BA_Acronyms_3857.head()
Out[16]:
BA Code NAME ADDRESS CITY STATE ZIP geometry Lat Long centroid polygon
0 AEC POWERSOUTH ENERGY COOPERATIVE 2027 EAST THREE NOTCH STREET ANDALUSIA AL 36421 POINT (-9611030.709 3729510.804) NaN NaN POINT (-9611030.709 3729510.804) MULTIPOLYGON (((-9742651.778 3539981.548, -974...
1 AECI ASSOCIATED ELECTRIC COOPERATIVE, INC. 2814 S GOLDEN AVE SPRINGFIELD MO 65807 POINT (-10358940.507 4601289.142) NaN NaN POINT (-10358940.507 4601289.142) MULTIPOLYGON (((-10074508.538 4224075.067, -10...
2 AVA AVISTA CORPORATION 1411 EAST MISSION AVENUE SPOKANE WA 99220 POINT (-13014863.554 5952876.520) NaN NaN POINT (-13014863.554 5952876.520) MULTIPOLYGON (((-13063826.484 6274913.995, -13...
3 AVRN AVANGRID RENEWABLES LLC 1125 NW COUCH ST., SUITE 700 PORTLAND OR 97209 POINT (-13444729.595 5393305.285) NaN NaN POINT (-13444729.595 5393305.285) MULTIPOLYGON (((-13467963.213 5406452.355, -13...
4 AZPS ARIZONA PUBLIC SERVICE COMPANY MS # 9997 PHOENIX AZ 85072 POINT (-12474930.473 3993335.322) NaN NaN POINT (-12474930.473 3993335.322) MULTIPOLYGON (((-12165073.555 3702720.955, -12...
In [17]:
BA_Acronyms_3857 = BA_Acronyms_3857.set_geometry('centroid')
BA_Acronyms_3857.head()
Out[17]:
BA Code NAME ADDRESS CITY STATE ZIP geometry Lat Long centroid polygon
0 AEC POWERSOUTH ENERGY COOPERATIVE 2027 EAST THREE NOTCH STREET ANDALUSIA AL 36421 POINT (-9611030.709 3729510.804) NaN NaN POINT (-9611030.709 3729510.804) MULTIPOLYGON (((-9742651.778 3539981.548, -974...
1 AECI ASSOCIATED ELECTRIC COOPERATIVE, INC. 2814 S GOLDEN AVE SPRINGFIELD MO 65807 POINT (-10358940.507 4601289.142) NaN NaN POINT (-10358940.507 4601289.142) MULTIPOLYGON (((-10074508.538 4224075.067, -10...
2 AVA AVISTA CORPORATION 1411 EAST MISSION AVENUE SPOKANE WA 99220 POINT (-13014863.554 5952876.520) NaN NaN POINT (-13014863.554 5952876.520) MULTIPOLYGON (((-13063826.484 6274913.995, -13...
3 AVRN AVANGRID RENEWABLES LLC 1125 NW COUCH ST., SUITE 700 PORTLAND OR 97209 POINT (-13444729.595 5393305.285) NaN NaN POINT (-13444729.595 5393305.285) MULTIPOLYGON (((-13467963.213 5406452.355, -13...
4 AZPS ARIZONA PUBLIC SERVICE COMPANY MS # 9997 PHOENIX AZ 85072 POINT (-12474930.473 3993335.322) NaN NaN POINT (-12474930.473 3993335.322) MULTIPOLYGON (((-12165073.555 3702720.955, -12...
In [18]:
#Import electricity transfer data
transferData_2018 = pd.read_csv('./Transfer Data/Total.Monthly.Transfers 2018 4.6.2022.csv')
transferData_2018.head()
Out[18]:
origin destination Year Month monthlyTransferMWh
0 AEC MISO 2018 1 119447
1 AEC MISO 2018 2 67662
2 AEC MISO 2018 3 41530
3 AEC MISO 2018 4 30003
4 AEC MISO 2018 5 50136
In [19]:
#import the us state data
usa_state = gpd.read_file('./State Data/tl_2020_us_state.shp')
#change the geoid column into int type
usa_state['GEOID'] = usa_state['GEOID'].astype(int)
usa_state.head()
Out[19]:
REGION DIVISION STATEFP STATENS GEOID STUSPS NAME LSAD MTFCC FUNCSTAT ALAND AWATER INTPTLAT INTPTLON geometry
0 3 5 54 01779805 54 WV West Virginia 00 G4000 A 62266296765 489206049 +38.6472854 -080.6183274 POLYGON ((-81.74725 39.09538, -81.74635 39.096...
1 3 5 12 00294478 12 FL Florida 00 G4000 A 138958484319 45975808217 +28.3989775 -082.5143005 MULTIPOLYGON (((-86.39964 30.22696, -86.40262 ...
2 2 3 17 01779784 17 IL Illinois 00 G4000 A 143778461053 6216594318 +40.1028754 -089.1526108 POLYGON ((-91.18529 40.63780, -91.17510 40.643...
3 2 4 27 00662849 27 MN Minnesota 00 G4000 A 206232157570 18949864226 +46.3159573 -094.1996043 POLYGON ((-96.78438 46.63050, -96.78434 46.630...
4 3 5 24 01714934 24 MD Maryland 00 G4000 A 25151895765 6979171386 +38.9466584 -076.6744939 POLYGON ((-77.45881 39.22027, -77.45866 39.220...
In [20]:
#change the state data coordinate reference system to EPSG3857
usa_state = usa_state.to_crs("EPSG:3857")
In [21]:
#only select the regions of US that we want to look into
usa_state['REGION'].unique()
usa_state.loc[usa_state['REGION'] != '9'].plot()
base = usa_state.loc[(usa_state['NAME'] != 'Hawaii') & (usa_state['NAME'] != 'Alaska') & (usa_state['REGION'] != '9')]

Visualization 1: Map of BA-to-BA electric transfers

  • How is electricity, and subsequently embedded resources and emissions, transferred around the U.S.?
In [22]:
#plot the balancing authority location over US map
fig, ax = plt.subplots(figsize=(50,50))

usa_state.loc[(usa_state['NAME'] != 'Hawaii') & (usa_state['NAME'] != 'Alaska') & (usa_state['REGION'] != '9')].plot(ax=ax,color='grey')

BA_Acronyms_3857.plot(column='BA Code', ax=ax, legend=True, markersize=300,marker='^',color='red')
plt.title('Balancing Authority location',size=100)
plt.show()
C:\Users\jenni\.conda\envs\geog489\lib\site-packages\geopandas\plotting.py:644: UserWarning: Only specify one of 'column' or 'color'. Using 'color'.
  warnings.warn(
In [23]:
#set the relation data to G
import networkx as nx
G = nx.from_pandas_edgelist(transferData_2018,source = 'origin', target = 'destination',edge_attr = 'monthlyTransferMWh')
In [24]:
#method 1 networkX
# all graph options
graphs_viz_options = [nx.draw, nx.draw_networkx, nx.draw_circular, nx.draw_kamada_kawai, nx.draw_random, nx.draw_shell, nx.draw_spring]

# plot graph option, can change from 0 to 6 to choose differernt plotting methods
selected_graph_option = 2
# plot
plt.figure(figsize=(30,30), dpi=50) 
graphs_viz_options[selected_graph_option](G)
In [25]:
#method 2
from pyvis.network import Network
#create vis network
net = Network(notebook = True)
#load the networkx graph
net.from_nx(G)
#show
net.show("example.html")
#the graph can be dragged to show the relationships between points
Out[25]:

Data Preparation Visualization 2: Emissions and Location Dataframe

  • Merge plant-level emissions data with plant location data
In [26]:
# Merge with plant-level GHG Emissions data with plant location data
fuelAndGeneration_2018_location =  pd.merge(fuelAndGeneration_2018_AggPlant,
                                                plantData_2018, 
                                                left_on=['Plant ID'], 
                                                right_on=['Plant Code'],
                                                how = 'left',
                                                suffixes =  ('', '_2'))
fuelAndGeneration_2018_location = fuelAndGeneration_2018_location[['Month', 'Plant ID', 'Latitude', 'Longitude', 'Balancing Authority Code',
                                                                           'GHG Estimate (lbs)', 'Net Generation (MWh)']] 
fuelAndGeneration_2018_location = fuelAndGeneration_2018_location[fuelAndGeneration_2018_location['Net Generation (MWh)'] > 0]
#fuelAndGeneration_2018_location.to_csv('fuelAndGeneration_2018_location.csv')
fuelAndGeneration_2018_location.head()
Out[26]:
Month Plant ID Latitude Longitude Balancing Authority Code GHG Estimate (lbs) Net Generation (MWh)
0 1 3 31.006900 -88.010300 SOCO 3.385828e+08 1267682.0
1 1 7 34.012800 -85.970800 SOCO 4.145301e+06 14592.0
2 1 8 33.644344 -87.196486 SOCO 2.683336e+08 665047.0
3 1 9 31.756900 -106.375000 EPE 1.432610e+06 3601.0
4 1 10 32.601700 -87.781100 SOCO 2.123426e+07 72463.0

Visualization 2: Map of Point Source GHG Emissions

  • How are greenhouse gases spatially emitted by electricty generation in the U.S.?
In [46]:
fig = px.scatter_geo(fuelAndGeneration_2018_location, 
                     lat = "Latitude",
                     lon = "Longitude",
                     hover_name="Plant ID", 
                     size="GHG Estimate (lbs)",
                     animation_frame="Month",
                     projection="albers usa")
fig.show()

Data Preparation Visualization 3: Emissions and Fuel Type Dataframe

  • Determine the primary fuel(s) used for each power plant
    • Aggregate various coal types together
    • By mass, most emissions come from coal and nattural gas
In [27]:
# Create a function to convert the codes of different coal type to a standard 'COAL'
def coal(fuelType):
    COAL_CODES = ['ANT','BIT','LIG','SUB','RC','WC']
    
    if fuelType in COAL_CODES:
        fuelType = 'COAL'
    
    return fuelType
In [28]:
# Convert the codes of different coal types to 'COAL'
fuelAndGeneration_2018long_Coal = fuelAndGeneration_2018long.copy()
fuelAndGeneration_2018long_Coal['Reported Fuel Type Code'] = fuelAndGeneration_2018long.apply(lambda row: coal(row['Reported Fuel Type Code']),axis=1)
#fuelAndGeneration_2018long_Coal
In [29]:
# Aggregate GHG Emissions by fuel type
emissions_byAllFuel_2018_Agg = fuelAndGeneration_2018long_Coal.groupby(by = ['Reported Fuel Type Code'],as_index=False).agg( {'GHG Estimate (lbs)': ['sum'],
                                                                                                                         'Net Generation (MWh)': ['sum']})

emissions_byAllFuel_2018_Agg.columns = emissions_byAllFuel_2018_Agg.columns.droplevel(1)
emissions_byAllFuel_2018_Agg['GHG emissions (lbs/MWh)'] = emissions_byAllFuel_2018_Agg['GHG Estimate (lbs)'] / emissions_byAllFuel_2018_Agg['Net Generation (MWh)']
ax = emissions_byAllFuel_2018_Agg.plot.bar(x='Reported Fuel Type Code', y='GHG Estimate (lbs)', rot=0)
#ax = emissions_byAllFuel_2018_Agg.plot.bar(x='Reported Fuel Type Code', y='GHG emissions (lbs/MWh)', rot=0)

Table 1: Fuel Type Codes

Fuel Code Fuel
AB Agricultural By-Products
COAL Coal
DFO Distillate Fuel Oil
JF Jet Fuel
KER Kerosene
LFG Landfill Gas
NG Natural Gas
OBG Other Biomass Gas
PC Petroleum Coke
PG Propane Gas
RFO Residual Fuel Oil
TDF Tire-Derived Fuels
WDS Wood/Wood Waste Solids
In [30]:
# Aggregate generation and emissions by plant and fuel type
fuelAndGeneration_2018_AggFuel = fuelAndGeneration_2018long_Coal.groupby(by = ['Month','Plant ID','Reported Fuel Type Code'],as_index=False).agg( {'GHG Estimate (lbs)': ['sum'],
                                                                                                                                              'Net Generation (MWh)': ['sum']})

fuelAndGeneration_2018_AggFuel.columns = fuelAndGeneration_2018_AggFuel.columns.droplevel(1)

# Keep emissions above a given threshold to get rid of secondary fuels
emissions_threshold3 = fuelAndGeneration_2018_AggFuel['GHG Estimate (lbs)'].quantile(.25)

fuelAndGeneration_2018_AggFuel_quantile = fuelAndGeneration_2018_AggFuel[fuelAndGeneration_2018_AggFuel['GHG Estimate (lbs)'] > emissions_threshold3]
#fuelAndGeneration_2018_AggFuel_quantile

# Keep emissions only for NG and Coal
fuelAndGeneration_2018_AggFuel_CoalNG = fuelAndGeneration_2018_AggFuel[fuelAndGeneration_2018_AggFuel['Reported Fuel Type Code'].isin(['NG','COAL']) ]
#fuelAndGeneration_2018_AggFuel_CoalNG
In [31]:
# Find the primary fuels for each plant

# What percentage of total plant generation is each fuel type responsible for?
fuelType_AggPlant =  pd.merge(fuelAndGeneration_2018_AggFuel_CoalNG,
                              fuelAndGeneration_2018_AggPlant,
                              left_on=['Plant ID','Month'], 
                              right_on=['Plant ID','Month'],
                              how = 'left',
                              suffixes =  ('', '_2'))

fuelType_AggPlant = fuelType_AggPlant[['Month','Plant ID', 'Reported Fuel Type Code', 
                                       'Net Generation (MWh)','Net Generation (MWh)_2']] 
#Keep fuels with generation greater than 0
fuelType_AggPlant = fuelType_AggPlant[fuelType_AggPlant['Net Generation (MWh)'] > 0] 
#Keep fuels that are more than 45% of total plant generation
fuelType_AggPlant = fuelType_AggPlant[(fuelType_AggPlant['Net Generation (MWh)'] / fuelType_AggPlant['Net Generation (MWh)_2']) > 0.30]
fuelType_AggPlant['% gen'] = (fuelType_AggPlant['Net Generation (MWh)'] / fuelType_AggPlant['Net Generation (MWh)_2']) * 100
#fuelType_AggPlant.to_csv('fuelType_AggPlant.csv')
#fuelType_AggPlant
In [32]:
# Create function for aggregting to a single fuel type
def aggFuelType(x):
    unique = []
    
    for val in x:
        if val not in unique:
            unique.append(val)
            
    unique.sort()
    
    if len(unique) == 1:
        unique = unique[0] 
    elif unique == ['COAL','NG']:
        unique = 'COAL_NG'
    else:
        unique = tuple(unique)
        
    return unique
In [33]:
#Aggregate by plant to find primary fuel type(s)
fuelType_AggPlant2 = fuelType_AggPlant.groupby(by = ['Plant ID','Month'],as_index=False).agg( {'Reported Fuel Type Code': [aggFuelType]})

fuelType_AggPlant2.columns = fuelType_AggPlant2.columns.droplevel(1)
#fuelType_AggPlant2['Reported Fuel Type Code'].value_counts()
In [34]:
# Merge plant emissions with fuel type
emissions_primaryFuelType =  pd.merge(fuelAndGeneration_2018_location,
                                        fuelType_AggPlant2, 
                                        left_on=['Plant ID','Month'], 
                                        right_on=['Plant ID','Month'],
                                        how = 'left',
                                        suffixes =  ('', '_2'))

emissions_primaryFuelType = emissions_primaryFuelType.dropna()
#emissions_primaryFuelType.to_csv('emissions_primaryFuelType.csv')
emissions_primaryFuelType.head()
Out[34]:
Month Plant ID Latitude Longitude Balancing Authority Code GHG Estimate (lbs) Net Generation (MWh) Reported Fuel Type Code
0 1 3 31.006900 -88.010300 SOCO 3.385828e+08 1267682.0 COAL_NG
1 1 7 34.012800 -85.970800 SOCO 4.145301e+06 14592.0 NG
2 1 8 33.644344 -87.196486 SOCO 2.683336e+08 665047.0 COAL
3 1 9 31.756900 -106.375000 EPE 1.432610e+06 3601.0 NG
4 1 10 32.601700 -87.781100 SOCO 2.123426e+07 72463.0 NG

Data Preparation Visualization 3: Census Data Preparation

  • Import median houshold income by census tract
  • Import census tract shapefiles
  • Merge income data with geometric data
In [35]:
# Import median household income by census tract
medianIncome_2018 = pd.read_csv('./Income Data/Median Income by Census Tract.csv',header=1)
medianIncome_2018 = medianIncome_2018[['id', 'Estimate!!Households!!Median income (dollars)']]
medianIncome_2018['GEOID'] = medianIncome_2018['id'].str.split('US', 1, expand=True)[1]
medianIncome_2018 = medianIncome_2018.rename(columns = {'Estimate!!Households!!Median income (dollars)': 'Median income (dollars)'})
medianIncome_2018 = medianIncome_2018[['GEOID', 'Median income (dollars)']]
medianIncome_2018.head()
Out[35]:
GEOID Median income (dollars)
0 01001020100 58625
1 01001020200 43531
2 01001020300 51875
3 01001020400 54050
4 01001020500 72417
In [36]:
# Import U.S. census tract shapefiles
censusTract_stateIDs = list(range(1,57))
removeVals = [3, 7, 14, 43, 52] # not US state or DC
nonContiguous = [2,15] # [Hawaii, Alaska]
censusTract_stateIDs = [i for i in censusTract_stateIDs if i not in removeVals]
censusTract_stateIDs_contiguous = [i for i in censusTract_stateIDs if i not in nonContiguous]

for n in censusTract_stateIDs_contiguous:
    if n < 10:
        tractID = '0'+str(n)
    else:
        tractID = str(n)
    
    tract_temp = gpd.read_file(f'./censusTracts_2018/tl_2018_{tractID}_tract/tl_2018_{tractID}_tract.shp')
    tract_temp = tract_temp[['GEOID','geometry']]
        
    if n == 1:
        US_tracts = tract_temp.copy()
    else:
        frames = [US_tracts,tract_temp]
        US_tracts = pd.concat(frames, sort=False)
       
US_tracts.plot("GEOID", cmap="Blues")  
US_tracts.head()
Out[36]:
GEOID geometry
0 01087232100 POLYGON ((-85.73401 32.40810, -85.73390 32.408...
1 01101002201 POLYGON ((-86.31642 32.34430, -86.31624 32.344...
2 01101002202 POLYGON ((-86.31681 32.32241, -86.31599 32.322...
3 01101002300 POLYGON ((-86.33477 32.32758, -86.33457 32.328...
4 01101002400 POLYGON ((-86.34781 32.32959, -86.34670 32.332...
In [37]:
# Merge Income data and tract geometry
incomeData_tractGeometry = US_tracts.merge(medianIncome_2018,
                                      left_on="GEOID",
                                      right_on='GEOID',
                                      how = 'left')
incomeData_tractGeometry.head()
Out[37]:
GEOID geometry Median income (dollars)
0 01087232100 POLYGON ((-85.73401 32.40810, -85.73390 32.408... 37098
1 01101002201 POLYGON ((-86.31642 32.34430, -86.31624 32.344... 26829
2 01101002202 POLYGON ((-86.31681 32.32241, -86.31599 32.322... 31276
3 01101002300 POLYGON ((-86.33477 32.32758, -86.33457 32.328... 26553
4 01101002400 POLYGON ((-86.34781 32.32959, -86.34670 32.332... 29250

Visualization 3: Map of Power Plants & Fuel Type and Median Household Income

  • What is the relationship between power plants (and subsequently emissions and pollution) and wealth?
In [38]:
# Change string values to values that can be changed to floats
incomeData_tractGeometry['Median income (dollars)'] = incomeData_tractGeometry['Median income (dollars)'].replace(['-', '250,000+', '2,500-'],['0', '250000', '2500'])
# Change all values to floats so natural breaks can classify
incomeData_tractGeometry['Median income (dollars)'] = [float(i) for i in incomeData_tractGeometry['Median income (dollars)']] 
# Change income tract data to World Mercator system
incomeData_tractGeometry.to_crs("EPSG:3395")
# Create figure and set figure limits
fig,ax = plt.subplots(figsize=(50,50))
ax.set_ylim([22.4, 54.2])
ax.set_xlim([-130.0, -60.4])
ax.set_title('Map of Power Plants over Median Household Income', fontsize=50)
# Plot choropleth map of median household income data
incomeData_tractGeometry.plot(ax=ax, cmap='Blues', column='Median income (dollars)',
                              legend=True, k=7, scheme='NaturalBreaks')
colors = ['#000000', '#4d9221', '#cb181d']
cmap = LinearSegmentedColormap.from_list('cm', colors, N=3)
# Create geodataframe from pandas dataframe
emissions = gpd.GeoDataFrame(emissions_primaryFuelType,
                            geometry=gpd.points_from_xy(emissions_primaryFuelType.Longitude,
                                                        emissions_primaryFuelType.Latitude)) 
# Set emissions coordinate system to World Mercator
emissions.set_crs("EPSG:3395")
# Plot locations of power plants over income data
emissions.plot(ax=ax, column='Reported Fuel Type Code', legend=True, cmap=cmap, markersize=150)
plt.show()

Visualization 4: Spatial visualization of BA-to-BA interchange

In [39]:
#import BA locations
BA_coords = pd.read_csv('./BA Geometry/Balancing Authority Coordinates 76.csv')
In [40]:
# Merge location data with transfer data
transfer_coords_1 =  pd.merge(transferData_2018,
                              BA_coords, 
                              left_on=['origin'], 
                              right_on=['BA Code'],
                              how = 'left',
                              suffixes =  ('', '_2'))
transfer_coords_2 =  pd.merge(transfer_coords_1,
                              BA_coords, 
                              left_on=['destination'], 
                              right_on=['BA Code'],
                              how = 'left',
                              suffixes =  ('', '_2'))
transfer_coords_2
Out[40]:
origin destination Year Month monthlyTransferMWh BA Code BA Name Region/Country Name Latitude Longitude Source BA Code_2 BA Name_2 Region/Country Name_2 Latitude_2 Longitude_2 Source_2
0 AEC MISO 2018 1 119447 AEC PowerSouth Energy Cooperative Southeast 31.729972 -86.338617 HIFLD MISO Midcontinent Independent System Operator, Inc. Midwest 41.449768 -92.857741 HIFLD
1 AEC MISO 2018 2 67662 AEC PowerSouth Energy Cooperative Southeast 31.729972 -86.338617 HIFLD MISO Midcontinent Independent System Operator, Inc. Midwest 41.449768 -92.857741 HIFLD
2 AEC MISO 2018 3 41530 AEC PowerSouth Energy Cooperative Southeast 31.729972 -86.338617 HIFLD MISO Midcontinent Independent System Operator, Inc. Midwest 41.449768 -92.857741 HIFLD
3 AEC MISO 2018 4 30003 AEC PowerSouth Energy Cooperative Southeast 31.729972 -86.338617 HIFLD MISO Midcontinent Independent System Operator, Inc. Midwest 41.449768 -92.857741 HIFLD
4 AEC MISO 2018 5 50136 AEC PowerSouth Energy Cooperative Southeast 31.729972 -86.338617 HIFLD MISO Midcontinent Independent System Operator, Inc. Midwest 41.449768 -92.857741 HIFLD
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
3241 YAD DUK 2018 8 148539 YAD Alcoa Power Generating, Inc. - Yadkin Division Carolinas 35.532017 -80.125356 HIFLD DUK Duke Energy Carolinas Carolinas 35.339984 -81.251092 HIFLD
3242 YAD DUK 2018 9 124077 YAD Alcoa Power Generating, Inc. - Yadkin Division Carolinas 35.532017 -80.125356 HIFLD DUK Duke Energy Carolinas Carolinas 35.339984 -81.251092 HIFLD
3243 YAD DUK 2018 10 150415 YAD Alcoa Power Generating, Inc. - Yadkin Division Carolinas 35.532017 -80.125356 HIFLD DUK Duke Energy Carolinas Carolinas 35.339984 -81.251092 HIFLD
3244 YAD DUK 2018 11 100513 YAD Alcoa Power Generating, Inc. - Yadkin Division Carolinas 35.532017 -80.125356 HIFLD DUK Duke Energy Carolinas Carolinas 35.339984 -81.251092 HIFLD
3245 YAD DUK 2018 12 121024 YAD Alcoa Power Generating, Inc. - Yadkin Division Carolinas 35.532017 -80.125356 HIFLD DUK Duke Energy Carolinas Carolinas 35.339984 -81.251092 HIFLD

3246 rows × 17 columns

In [41]:
fig = go.Figure()

# Map Connections
for i in range(len(transfer_coords_2)):
    fig.add_trace(
        go.Scattergeo(
                      locationmode = 'USA-states',
                      lon = [transfer_coords_2['Longitude'][i], transfer_coords_2['Longitude_2'][i]],
                      lat = [transfer_coords_2['Latitude'][i], transfer_coords_2['Latitude_2'][i]],
                      mode = 'lines',
                      line = dict(width = transfer_coords_2['monthlyTransferMWh'][i] / 400000,color = '#FCA311'),
                      #opacity = float(transfer_coords_2['average'][i]),
                      )
                   )

# Map BA points
fig.add_trace(go.Scattergeo(
              locationmode = "USA-states",
              lon = BA_coords['Longitude'],
              lat = BA_coords['Latitude'],
              hoverinfo = 'text',
              text = BA_coords['BA Code'],
              mode = 'markers',
              marker = dict(
                size = 4,
                color = '#4F518C',
                line = dict(
                    width = 3,
                    color = 'rgba(68, 68, 68, 0)'))))

# Update Layout of map
fig.update_layout(
    title_text = 'BA-to-BA transfers',
    showlegend = False,
    geo = dict(
                scope = 'usa',
                projection_type = "albers usa" ,
                showland = True,
                landcolor = 'rgb(243, 243, 243)',
                countrycolor = 'rgb(204, 204, 204)',
                showcountries=True,
                subunitcolor = 'black')
    
)

fig.show()

Project Roles

  • Jenni Nugent: Data cleaning and preparation; Visualization 2; Visualization 4
  • Guangyu Cai: Visualization 1
  • Amina Jackson: Visualization 2
  • Joe Franke: Visualization 3
In [ ]: