Food Access During the COVID-19 Pandemic

Author: Daoqin Tong, Xueting Jin

Date Created: 11/23/2021

This notebook provides the code for running a Segmented Linear Regression Model to examine visits to grocery stores in six large metropolitan statistical areas (MSAs).

Load Required Packages

In [1]:
import numpy as np
import pandas as pd
import statsmodels.api as sm

Load and Preview Data

In [2]:
food_access = pd.read_excel('Regression_Data.xlsx')
print(food_access.head(5))
          GeoID       Visits    AWDist  intD  TripChain_B  TripChain_D  \
0  170310101002  1756.663795  9.074393     0   606.279159          0.0   
1  170310101003  2197.925421  7.045779     0   698.586606          0.0   
2  170310102011  3897.613786  6.069707     0  1080.965566          0.0   
3  170310102012  4683.447118  7.227351     0  1486.399091          0.0   
4  170310102013  1573.585389  6.101111     0   355.207317          0.0   

   OtherR_Visits  Fast_Visits  Dist_Convenience  Dist_LargeRetailer  ...  \
0    3943.300326  1228.570038          0.292435            2.669511  ...   
1    5118.357246  1541.180055          0.180847            2.426438  ...   
2    4035.986211  1685.464689          0.155907            2.066019  ...   
3    8024.780492  3173.090726          0.284614            2.050767  ...   
4    2730.858675  1244.316341          0.207635            2.176084  ...   

   No_Vehicle_B  No_Vehicle_D  High_Educated_B  High_Educated_D  May  \
0     45.825427           0.0        37.233310              0.0    0   
1     54.147251           0.0        60.928571              0.0    0   
2     23.735955           0.0        40.533778              0.0    0   
3     34.691011           0.0        30.654545              0.0    0   
4     27.986907           0.0        27.089073              0.0    0   

   PCT_Case  POPDensityKsqmi  Commute  PCT_PUBLIC      MSA  
0       0.0           0.9935     34.4       12.36  Chicago  
1       0.0           0.9935     34.4       12.36  Chicago  
2       0.0           0.9935     34.4       12.36  Chicago  
3       0.0           0.9935     34.4       12.36  Chicago  
4       0.0           0.9935     34.4       12.36  Chicago  

[5 rows x 28 columns]

Defning the variables

image.png

In [3]:
y1 = food_access.iloc[:,1]
x1 = food_access.iloc[:,3:27]
x1 = sm.add_constant(x1)
In [4]:
y2 = food_access.iloc[:,2]
x2 = food_access.iloc[:,[3,4,5,8,9,10,11,12,14,15,16,17,18,19,20,21,22,23,24,25,26]]
x2 = sm.add_constant(x2)

Run Model 1

In [5]:
model1 = sm.OLS(y1,x1)
result1 = model1.fit()
print(result1.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                 Visits   R-squared:                       0.887
Model:                            OLS   Adj. R-squared:                  0.887
Method:                 Least Squares   F-statistic:                 4.526e+04
Date:                Wed, 24 Nov 2021   Prob (F-statistic):               0.00
Time:                        17:58:02   Log-Likelihood:            -1.1144e+06
No. Observations:              137991   AIC:                         2.229e+06
Df Residuals:                  137966   BIC:                         2.229e+06
Df Model:                          24                                         
Covariance Type:            nonrobust                                         
======================================================================================
                         coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------------------
const                651.9254     40.300     16.177      0.000     572.938     730.913
intD                 245.3612     11.289     21.734      0.000     223.235     267.488
TripChain_B            2.3147      0.004    535.185      0.000       2.306       2.323
TripChain_D            2.2311      0.006    394.149      0.000       2.220       2.242
OtherR_Visits         -0.0439      0.002    -21.235      0.000      -0.048      -0.040
Fast_Visits            0.3696      0.004     96.972      0.000       0.362       0.377
Dist_Convenience     108.0727      3.734     28.945      0.000     100.755     115.391
Dist_LargeRetailer   -17.4920      0.816    -21.446      0.000     -19.091     -15.893
Dist_Specialty        13.0355      0.616     21.159      0.000      11.828      14.243
Dist_Supermarket       4.5200      2.172      2.081      0.037       0.263       8.777
Dist_Warehouse        -5.0634      0.620     -8.167      0.000      -6.279      -3.848
Population             0.2511      0.004     65.698      0.000       0.244       0.259
PCT_17                 4.4685      0.294     15.189      0.000       3.892       5.045
PCT_65                 5.7481      0.243     23.677      0.000       5.272       6.224
PCT_race              -2.6667      0.094    -28.470      0.000      -2.850      -2.483
Income                 0.9692      0.083     11.710      0.000       0.807       1.131
No_Vehicle_B           0.1023      0.167      0.611      0.541      -0.226       0.430
No_Vehicle_D          -4.0589      0.180    -22.608      0.000      -4.411      -3.707
High_Educated_B       -0.7727      0.181     -4.268      0.000      -1.127      -0.418
High_Educated_D        2.1330      0.186     11.448      0.000       1.768       2.498
May                  -83.9182      4.464    -18.798      0.000     -92.668     -75.168
PCT_Case             -22.1611      0.902    -24.578      0.000     -23.928     -20.394
POPDensityKsqmi       59.1955      4.282     13.823      0.000      50.802      67.589
Commute              -39.0944      1.364    -28.672      0.000     -41.767     -36.422
PCT_PUBLIC            12.6462      0.352     35.898      0.000      11.956      13.337
==============================================================================
Omnibus:                    92918.776   Durbin-Watson:                   0.976
Prob(Omnibus):                  0.000   Jarque-Bera (JB):         17547130.540
Skew:                           2.241   Prob(JB):                         0.00
Kurtosis:                      58.062   Cond. No.                     6.90e+04
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 6.9e+04. This might indicate that there are
strong multicollinearity or other numerical problems.

Run Model 2

In [6]:
model2 = sm.OLS(y2,x2)
result2 = model2.fit()
print(result2.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                 AWDist   R-squared:                       0.235
Model:                            OLS   Adj. R-squared:                  0.234
Method:                 Least Squares   F-statistic:                     2013.
Date:                Wed, 24 Nov 2021   Prob (F-statistic):               0.00
Time:                        17:58:03   Log-Likelihood:            -3.9549e+05
No. Observations:              137991   AIC:                         7.910e+05
Df Residuals:                  137969   BIC:                         7.912e+05
Df Model:                          21                                         
Covariance Type:            nonrobust                                         
======================================================================================
                         coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------------------
const                  6.6923      0.220     30.474      0.000       6.262       7.123
intD                  -1.3596      0.061    -22.288      0.000      -1.479      -1.240
TripChain_B         4.647e-05   2.06e-05      2.251      0.024    6.01e-06    8.69e-05
TripChain_D            0.0004   2.56e-05     15.561      0.000       0.000       0.000
Dist_Convenience       0.6984      0.020     34.375      0.000       0.659       0.738
Dist_LargeRetailer     0.0791      0.004     17.777      0.000       0.070       0.088
Dist_Specialty         0.1280      0.003     38.156      0.000       0.121       0.135
Dist_Supermarket       0.4054      0.012     34.171      0.000       0.382       0.429
Dist_Warehouse         0.1198      0.003     35.364      0.000       0.113       0.126
PCT_17                -0.0284      0.002    -17.846      0.000      -0.031      -0.025
PCT_65                -0.0188      0.001    -14.256      0.000      -0.021      -0.016
PCT_race              -0.0027      0.001     -5.265      0.000      -0.004      -0.002
Income                -0.0056      0.000    -12.521      0.000      -0.007      -0.005
No_Vehicle_B          -0.0099      0.001    -10.881      0.000      -0.012      -0.008
No_Vehicle_D           0.0082      0.001      8.476      0.000       0.006       0.010
High_Educated_B        0.0164      0.001     16.738      0.000       0.014       0.018
High_Educated_D        0.0088      0.001      8.746      0.000       0.007       0.011
May                    0.3634      0.024     14.970      0.000       0.316       0.411
PCT_Case               0.0869      0.005     17.746      0.000       0.077       0.096
POPDensityKsqmi       -0.3249      0.023    -14.042      0.000      -0.370      -0.280
Commute                0.0383      0.007      5.158      0.000       0.024       0.053
PCT_PUBLIC            -0.0047      0.002     -2.447      0.014      -0.008      -0.001
==============================================================================
Omnibus:                   139901.766   Durbin-Watson:                   1.631
Prob(Omnibus):                  0.000   Jarque-Bera (JB):         32497106.534
Skew:                           4.534   Prob(JB):                         0.00
Kurtosis:                      77.631   Cond. No.                     1.49e+04
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.49e+04. This might indicate that there are
strong multicollinearity or other numerical problems.