第7章操作変数法

先に出版社サイトよりデータをダウンロードする.

# サポートファイルへのリンク
curl <- "https://www.yuhikaku.co.jp/static_files/05385_support07.zip"
# ダウンロード保存用フォルダが存在しない場合, 作成
if(!dir.exists("downloads")){
    dir.create("downloads")
}
cdestfile <- "downloads/support07.zip"
download.file(curl, cdestfile)
# データ保存用フォルダが存在しない場合, 作成
if(!dir.exists("data")){
    dir.create("data")
}
# WSL上のRで解凍すると文字化けするので、Linuxのコマンドを外部呼び出し
# Windowsの場合は別途コマンドを用いる.
if(.Platform$OS.type == "unix") {
    system(sprintf('unzip -n -Ocp932 %s -d %s', "downloads/support07.zip", "./data"))
} else {
    print("Windowsで解凍するコマンドを別途追加せよ.")
}

必要なライブラリを読み込む.

library(tidyverse)
library(estimatr)
library(AER)
## Loading required package: lmtest
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## Loading required package: sandwich
## Loading required package: survival
library(kableExtra)

実証例7.1 単回帰モデルの操作変数推定

まずはデータを読み込む.

ipehd_qje2009_master <- read.csv("data/07_第7章/ipehd_qje2009_master.csv")

操作変数法はestimatr::iv_robust()やAER::ivreg()などで実行できる.

lm_robust(f_rw ~ f_prot, se_type = "stata", data = ipehd_qje2009_master) %>% summary()
## 
## Call:
## lm_robust(formula = f_rw ~ f_prot, data = ipehd_qje2009_master, 
##     se_type = "stata")
## 
## Standard error type:  HC1 
## 
## Coefficients:
##             Estimate Std. Error t value   Pr(>|t|) CI Lower CI Upper  DF
## (Intercept) 82.37419    1.41304  58.296 7.716e-212 79.59721  85.1512 450
## f_prot       0.07998    0.01611   4.964  9.831e-07  0.04831   0.1116 450
## 
## Multiple R-squared:  0.057 , Adjusted R-squared:  0.05491 
## F-statistic: 24.64 on 1 and 450 DF,  p-value: 9.831e-07
iv_robust(f_rw ~ f_prot | kmwittenberg, se_type = "stata", data = ipehd_qje2009_master) %>% summary()
## 
## Call:
## iv_robust(formula = f_rw ~ f_prot | kmwittenberg, data = ipehd_qje2009_master, 
##     se_type = "stata")
## 
## Standard error type:  HC1 
## 
## Coefficients:
##             Estimate Std. Error t value  Pr(>|t|) CI Lower CI Upper  DF
## (Intercept)  60.4509    5.11310   11.82 2.836e-28  50.4024  70.4994 450
## f_prot        0.4216    0.07109    5.93 6.050e-09   0.2819   0.5613 450
## 
## Multiple R-squared:  -0.9827 ,   Adjusted R-squared:  -0.9871 
## F-statistic: 35.16 on 1 and 450 DF,  p-value: 6.05e-09
ivreg(f_rw ~ f_prot | kmwittenberg, data = ipehd_qje2009_master) %>% summary(vcov = sandwich::sandwich)
## 
## Call:
## ivreg(formula = f_rw ~ f_prot | kmwittenberg, data = ipehd_qje2009_master)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -44.342  -9.772  -4.798  11.434  37.559 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 60.45090    5.10177  11.849  < 2e-16 ***
## f_prot       0.42157    0.07094   5.943 5.62e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 17.87 on 450 degrees of freedom
## Multiple R-Squared: -0.9827, Adjusted R-squared: -0.9871 
## Wald test: 35.32 on 1 and 450 DF,  p-value: 5.616e-09

実証例7.2 操作変数推定量の標準誤差

実証例7.1を参照せよ.

実証例7.3 19世紀プロイセンのデータの外生性を含めた2SLS推定による分析

1段階目, 2段階目の両方にコントロール変数を追加する.

iv_robust(f_rw ~ f_prot + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss | kmwittenberg + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss, se_type = "stata", data = ipehd_qje2009_master) %>% summary()
## 
## Call:
## iv_robust(formula = f_rw ~ f_prot + f_young + f_jew + f_fem + 
##     f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + 
##     f_dumb + f_miss | kmwittenberg + f_young + f_jew + f_fem + 
##     f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + 
##     f_dumb + f_miss, data = ipehd_qje2009_master, se_type = "stata")
## 
## Standard error type:  HC1 
## 
## Coefficients:
##             Estimate Std. Error  t value  Pr(>|t|) CI Lower  CI Upper  DF
## (Intercept) 177.3665   29.49620   6.0132 3.838e-09 119.3948 235.33821 438
## f_prot        0.1885    0.02707   6.9640 1.217e-11   0.1353   0.24170 438
## f_young      -1.9524    0.16773 -11.6400 1.806e-27  -2.2820  -1.62271 438
## f_jew        -0.4367    0.34981  -1.2483 2.126e-01  -1.1242   0.25084 438
## f_fem        -1.0726    0.33087  -3.2419 1.278e-03  -1.7229  -0.42234 438
## f_ortsgeb     0.6066    0.05177  11.7170 9.056e-28   0.5048   0.70835 438
## f_pruss      -0.1807    0.14177  -1.2746 2.031e-01  -0.4593   0.09794 438
## hhsize        0.8849    1.64334   0.5385 5.905e-01  -2.3449   4.11475 438
## lnpop        -1.3181    0.95739  -1.3768 1.693e-01  -3.1998   0.56351 438
## gpop          0.4099    0.10914   3.7558 1.961e-04   0.1954   0.62441 438
## f_blind     -27.8647   14.98451  -1.8596 6.362e-02 -57.3152   1.58580 438
## f_deaf      -52.3831   10.41183  -5.0311 7.130e-07 -72.8464 -31.91972 438
## f_dumb        7.5290    1.70627   4.4125 1.288e-05   4.1755  10.88248 438
## f_miss       -0.5049    0.37778  -1.3366 1.821e-01  -1.2474   0.23755 438
## 
## Multiple R-squared:  0.6886 ,    Adjusted R-squared:  0.6794 
## F-statistic: 66.57 on 13 and 438 DF,  p-value: < 2.2e-16
ivreg(f_rw ~ f_prot + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss | kmwittenberg + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss, data = ipehd_qje2009_master) %>% summary(vcov = sandwich::sandwich)
## 
## Call:
## ivreg(formula = f_rw ~ f_prot + f_young + f_jew + f_fem + f_ortsgeb + 
##     f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + 
##     f_miss | kmwittenberg + f_young + f_jew + f_fem + f_ortsgeb + 
##     f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + 
##     f_miss, data = ipehd_qje2009_master)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -21.5658  -4.4133  -0.1371   4.7447  19.9628 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 177.36653   29.03581   6.109 2.22e-09 ***
## f_prot        0.18850    0.02665   7.074 5.98e-12 ***
## f_young      -1.95236    0.16511 -11.825  < 2e-16 ***
## f_jew        -0.43667    0.34435  -1.268 0.205434    
## f_fem        -1.07263    0.32570  -3.293 0.001071 ** 
## f_ortsgeb     0.60660    0.05096  11.903  < 2e-16 ***
## f_pruss      -0.18070    0.13956  -1.295 0.196067    
## hhsize        0.88494    1.61769   0.547 0.584630    
## lnpop        -1.31814    0.94245  -1.399 0.162630    
## gpop          0.40991    0.10744   3.815 0.000156 ***
## f_blind     -27.86468   14.75062  -1.889 0.059545 .  
## f_deaf      -52.38308   10.24932  -5.111 4.80e-07 ***
## f_dumb        7.52898    1.67964   4.483 9.43e-06 ***
## f_miss       -0.50494    0.37188  -1.358 0.175231    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.176 on 438 degrees of freedom
## Multiple R-Squared: 0.6886,  Adjusted R-squared: 0.6794 
## Wald test: 68.69 on 13 and 438 DF,  p-value: < 2.2e-16

iv_robust(f_rw ~ f_prot + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss | kmwittenberg * (lnpop + gpop) + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + f_blind + f_deaf + f_dumb + f_miss, se_type = "stata", data = ipehd_qje2009_master) %>% summary()
## 
## Call:
## iv_robust(formula = f_rw ~ f_prot + f_young + f_jew + f_fem + 
##     f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + 
##     f_dumb + f_miss | kmwittenberg * (lnpop + gpop) + f_young + 
##     f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + f_blind + 
##     f_deaf + f_dumb + f_miss, data = ipehd_qje2009_master, se_type = "stata")
## 
## Standard error type:  HC1 
## 
## Coefficients:
##             Estimate Std. Error t value  Pr(>|t|) CI Lower  CI Upper  DF
## (Intercept) 179.1507   29.13265   6.149 1.753e-09 121.8935 236.40788 438
## f_prot        0.1853    0.02572   7.205 2.553e-12   0.1348   0.23591 438
## f_young      -1.9518    0.16695 -11.691 1.145e-27  -2.2799  -1.62366 438
## f_jew        -0.4553    0.34569  -1.317 1.885e-01  -1.1348   0.22406 438
## f_fem        -1.0800    0.33013  -3.271 1.155e-03  -1.7288  -0.43113 438
## f_ortsgeb     0.6023    0.04980  12.093 2.996e-29   0.5044   0.70016 438
## f_pruss      -0.1858    0.14074  -1.320 1.876e-01  -0.4624   0.09085 438
## hhsize        0.7897    1.60849   0.491 6.237e-01  -2.3716   3.95102 438
## lnpop        -1.3133    0.95155  -1.380 1.682e-01  -3.1835   0.55683 438
## gpop          0.4020    0.10650   3.775 1.823e-04   0.1927   0.61133 438
## f_blind     -28.4769   14.69031  -1.938 5.321e-02 -57.3492   0.39532 438
## f_deaf      -52.1996   10.37314  -5.032 7.092e-07 -72.5869 -31.81228 438
## f_dumb        7.5271    1.68652   4.463 1.029e-05   4.2124  10.84175 438
## f_miss       -0.4979    0.37693  -1.321 1.872e-01  -1.2388   0.24288 438
## 
## Multiple R-squared:  0.692 , Adjusted R-squared:  0.6828 
## F-statistic: 66.54 on 13 and 438 DF,  p-value: < 2.2e-16
ivreg(f_rw ~ f_prot + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss | kmwittenberg * (lnpop + gpop) + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + f_blind + f_deaf + f_dumb + f_miss, data = ipehd_qje2009_master) %>% summary(vcov = sandwich::sandwich)
## 
## Call:
## ivreg(formula = f_rw ~ f_prot + f_young + f_jew + f_fem + f_ortsgeb + 
##     f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + 
##     f_miss | kmwittenberg * (lnpop + gpop) + f_young + f_jew + 
##     f_fem + f_ortsgeb + f_pruss + hhsize + f_blind + f_deaf + 
##     f_dumb + f_miss, data = ipehd_qje2009_master)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -21.4933  -4.4063  -0.1173   4.6397  19.7299 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 179.15072   28.67794   6.247 9.92e-10 ***
## f_prot        0.18535    0.02532   7.319 1.20e-12 ***
## f_young      -1.95178    0.16434 -11.876  < 2e-16 ***
## f_jew        -0.45535    0.34029  -1.338 0.181553    
## f_fem        -1.07995    0.32497  -3.323 0.000965 ***
## f_ortsgeb     0.60228    0.04903  12.285  < 2e-16 ***
## f_pruss      -0.18577    0.13855  -1.341 0.180673    
## hhsize        0.78970    1.58339   0.499 0.618213    
## lnpop        -1.31335    0.93670  -1.402 0.161593    
## gpop          0.40201    0.10484   3.835 0.000144 ***
## f_blind     -28.47694   14.46102  -1.969 0.049558 *  
## f_deaf      -52.19960   10.21123  -5.112 4.77e-07 ***
## f_dumb        7.52707    1.66020   4.534 7.49e-06 ***
## f_miss       -0.49794    0.37105  -1.342 0.180298    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.138 on 438 degrees of freedom
## Multiple R-Squared: 0.692,   Adjusted R-squared: 0.6828 
## Wald test: 68.66 on 13 and 438 DF,  p-value: < 2.2e-16

実証例7.4 操作変数の強さの判定

lm_robust(f_prot ~ kmwittenberg + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss, se_type = "stata", data = ipehd_qje2009_master) %>% summary()
## 
## Call:
## lm_robust(formula = f_prot ~ kmwittenberg + f_young + f_jew + 
##     f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + 
##     f_deaf + f_dumb + f_miss, data = ipehd_qje2009_master, se_type = "stata")
## 
## Standard error type:  HC1 
## 
## Coefficients:
##               Estimate Std. Error  t value  Pr(>|t|)  CI Lower  CI Upper  DF
## (Intercept)  478.89583   119.4917   4.0078 7.203e-05  244.0475 713.74420 438
## kmwittenberg  -0.09469     0.0114  -8.3036 1.270e-15   -0.1171  -0.07228 438
## f_young        0.20509     0.7650   0.2681 7.888e-01   -1.2984   1.70859 438
## f_jew         -7.26355     1.2312  -5.8998 7.292e-09   -9.6833  -4.84383 438
## f_fem         -0.55734     1.3016  -0.4282 6.687e-01   -3.1155   2.00086 438
## f_ortsgeb     -1.39007     0.1331 -10.4462 5.882e-23   -1.6516  -1.12853 438
## f_pruss       -1.93455     0.7897  -2.4497 1.469e-02   -3.4867  -0.38244 438
## hhsize       -14.61046     5.9310  -2.4634 1.415e-02  -26.2672  -2.95374 438
## lnpop         -0.97709     3.9664  -0.2463 8.055e-01   -8.7727   6.81850 438
## gpop          -1.96156     0.3261  -6.0147 3.806e-09   -2.6025  -1.32059 438
## f_blind      -82.26611    57.0172  -1.4428 1.498e-01 -194.3274  29.79517 438
## f_deaf       104.61556    31.5326   3.3177 9.833e-04   42.6415 166.58965 438
## f_dumb        -2.31377     8.7324  -0.2650 7.912e-01  -19.4764  14.84886 438
## f_miss         1.72902     1.4842   1.1649 2.447e-01   -1.1880   4.64609 438
## 
## Multiple R-squared:  0.4194 ,    Adjusted R-squared:  0.4021 
## F-statistic: 40.82 on 13 and 438 DF,  p-value: < 2.2e-16

lm_robust(f_prot ~ kmwittenberg * (lnpop + gpop) + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + f_blind + f_deaf + f_dumb + f_miss, se_type = "stata", data = ipehd_qje2009_master) %>% summary()
## 
## Call:
## lm_robust(formula = f_prot ~ kmwittenberg * (lnpop + gpop) + 
##     f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + 
##     f_blind + f_deaf + f_dumb + f_miss, data = ipehd_qje2009_master, 
##     se_type = "stata")
## 
## Standard error type:  HC1 
## 
## Coefficients:
##                      Estimate Std. Error  t value  Pr(>|t|)   CI Lower
## (Intercept)        649.639939  1.412e+02   4.6014 5.510e-06  3.722e+02
## kmwittenberg        -0.580762  2.998e-01  -1.9375 5.334e-02 -1.170e+00
## lnpop              -16.415951  8.621e+00  -1.9041 5.755e-02 -3.336e+01
## gpop                 0.726697  7.633e-01   0.9520 3.416e-01 -7.736e-01
## f_young              0.613596  8.027e-01   0.7644 4.450e-01 -9.641e-01
## f_jew               -7.053875  1.235e+00  -5.7134 2.054e-08 -9.480e+00
## f_fem               -0.740260  1.301e+00  -0.5692 5.695e-01 -3.296e+00
## f_ortsgeb           -1.377103  1.315e-01 -10.4708 4.894e-23 -1.636e+00
## f_pruss             -1.927401  7.300e-01  -2.6401 8.585e-03 -3.362e+00
## hhsize             -16.275257  5.844e+00  -2.7852 5.583e-03 -2.776e+01
## f_blind            -87.656566  5.671e+01  -1.5457 1.229e-01 -1.991e+02
## f_deaf              91.641755  3.220e+01   2.8457 4.641e-03  2.835e+01
## f_dumb              -2.015459  8.753e+00  -0.2303 8.180e-01 -1.922e+01
## f_miss               2.068423  1.493e+00   1.3858 1.665e-01 -8.651e-01
## kmwittenberg:lnpop   0.045825  2.780e-02   1.6486 9.995e-02 -8.806e-03
## kmwittenberg:gpop   -0.007717  2.129e-03  -3.6249 3.232e-04 -1.190e-02
##                      CI Upper  DF
## (Intercept)        927.123943 436
## kmwittenberg         0.008384 436
## lnpop                0.528377 436
## gpop                 2.226960 436
## f_young              2.191285 436
## f_jew               -4.627328 436
## f_fem                1.815939 436
## f_ortsgeb           -1.118613 436
## f_pruss             -0.492568 436
## hhsize              -4.790276 436
## f_blind             23.798822 436
## f_deaf             154.935818 436
## f_dumb              15.187240 436
## f_miss               5.001929 436
## kmwittenberg:lnpop   0.100455 436
## kmwittenberg:gpop   -0.003533 436
## 
## Multiple R-squared:  0.4307 ,    Adjusted R-squared:  0.4111 
## F-statistic: 34.26 on 15 and 436 DF,  p-value: < 2.2e-16

実証例7.5 操作変数の外生性の検定

ここでは均一分散用の標準誤差で計算する. HansenのJ検定は, iv_robust()でdiagnostics = TRUEとすると, summary()のOveridentifyingの欄に表示される. (ivreg()でもsummary(diagnostics = TRUE)とすると同様にできるはずだが, 同じ値が得られなかった.)

model_73_1 <- ivreg(f_rw ~ f_prot + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss | kmwittenberg + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss, data = ipehd_qje2009_master)
model_75_1 <- lm(model_73_1$residuals ~ kmwittenberg + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss, data = ipehd_qje2009_master)
summary(model_75_1)
## 
## Call:
## lm(formula = model_73_1$residuals ~ kmwittenberg + f_young + 
##     f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + 
##     f_blind + f_deaf + f_dumb + f_miss, data = ipehd_qje2009_master)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -21.5658  -4.4133  -0.1371   4.7447  19.9628 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)   3.329e-12  2.935e+01       0        1
## kmwittenberg  1.454e-16  2.697e-03       0        1
## f_young       4.452e-15  1.723e-01       0        1
## f_jew         6.018e-15  3.047e-01       0        1
## f_fem        -2.790e-14  3.294e-01       0        1
## f_ortsgeb    -1.756e-16  3.294e-02       0        1
## f_pruss      -1.257e-14  1.967e-01       0        1
## hhsize       -9.685e-14  1.417e+00       0        1
## lnpop        -2.678e-14  9.525e-01       0        1
## gpop         -4.846e-15  9.918e-02       0        1
## f_blind       1.364e-15  1.252e+01       0        1
## f_deaf        9.140e-14  8.272e+00       0        1
## f_dumb       -2.833e-14  2.069e+00       0        1
## f_miss        3.208e-15  3.479e-01       0        1
## 
## Residual standard error: 7.176 on 438 degrees of freedom
## Multiple R-squared:  3.428e-29,  Adjusted R-squared:  -0.02968 
## F-statistic: 1.155e-27 on 13 and 438 DF,  p-value: 1
hypothesis <- c("kmwittenberg", "lnpop", "gpop", "f_young", "f_jew", "f_fem", "f_ortsgeb", "f_pruss", "hhsize", "f_blind", "f_deaf", "f_dumb", "f_miss")
linearHypothesis(model_75_1, hypothesis, rep(0, length(hypothesis)), "Chisq")
## 
## Linear hypothesis test:
## kmwittenberg = 0
## lnpop = 0
## gpop = 0
## f_young = 0
## f_jew = 0
## f_fem = 0
## f_ortsgeb = 0
## f_pruss = 0
## hhsize = 0
## f_blind = 0
## f_deaf = 0
## f_dumb = 0
## f_miss = 0
## 
## Model 1: restricted model
## Model 2: model_73_1$residuals ~ kmwittenberg + f_young + f_jew + f_fem + 
##     f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + 
##     f_dumb + f_miss
## 
##   Res.Df   RSS Df Sum of Sq Chisq Pr(>Chisq)
## 1    451 22556                              
## 2    438 22556 13         0     0          1

model_73_2 <- ivreg(f_rw ~ f_prot + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss | kmwittenberg * (lnpop + gpop) + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + f_blind + f_deaf + f_dumb + f_miss, data = ipehd_qje2009_master)
model_75_2 <- lm(model_73_2$residuals ~ kmwittenberg * (lnpop + gpop) + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + f_blind + f_deaf + f_dumb + f_miss, data = ipehd_qje2009_master)
summary(model_75_2)
## 
## Call:
## lm(formula = model_73_2$residuals ~ kmwittenberg * (lnpop + gpop) + 
##     f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + 
##     f_blind + f_deaf + f_dumb + f_miss, data = ipehd_qje2009_master)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -21.3447  -4.3890  -0.1007   4.6353  19.7419 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)
## (Intercept)        -1.193e+01  3.738e+01  -0.319    0.750
## kmwittenberg        3.506e-02  7.091e-02   0.494    0.621
## lnpop               1.034e+00  2.285e+00   0.452    0.651
## gpop               -5.554e-02  2.489e-01  -0.223    0.824
## f_young             6.741e-04  1.778e-01   0.004    0.997
## f_jew              -5.388e-03  3.044e-01  -0.018    0.986
## f_fem               1.195e-02  3.287e-01   0.036    0.971
## f_ortsgeb          -5.907e-04  3.285e-02  -0.018    0.986
## f_pruss            -1.984e-03  1.960e-01  -0.010    0.992
## hhsize              8.812e-02  1.419e+00   0.062    0.951
## f_blind             8.972e-01  1.253e+01   0.072    0.943
## f_deaf              4.100e-01  8.318e+00   0.049    0.961
## f_dumb             -2.080e-02  2.062e+00  -0.010    0.992
## f_miss             -1.432e-02  3.480e-01  -0.041    0.967
## kmwittenberg:lnpop -3.290e-03  6.587e-03  -0.499    0.618
## kmwittenberg:gpop   1.663e-04  6.549e-04   0.254    0.800
## 
## Residual standard error: 7.152 on 436 degrees of freedom
## Multiple R-squared:  0.0006048,  Adjusted R-squared:  -0.03378 
## F-statistic: 0.01759 on 15 and 436 DF,  p-value: 1
hypothesis <- c("kmwittenberg", "lnpop", "gpop", "kmwittenberg:lnpop", "kmwittenberg:gpop", "f_young", "f_jew", "f_fem", "f_ortsgeb", "f_pruss", "hhsize", "f_blind", "f_deaf", "f_dumb", "f_miss")
linearHypothesis(model_75_2, hypothesis, rep(0, length(hypothesis)), "Chisq")
## 
## Linear hypothesis test:
## kmwittenberg = 0
## lnpop = 0
## gpop = 0
## kmwittenberg:lnpop = 0
## kmwittenberg:gpop = 0
## f_young = 0
## f_jew = 0
## f_fem = 0
## f_ortsgeb = 0
## f_pruss = 0
## hhsize = 0
## f_blind = 0
## f_deaf = 0
## f_dumb = 0
## f_miss = 0
## 
## Model 1: restricted model
## Model 2: model_73_2$residuals ~ kmwittenberg * (lnpop + gpop) + f_young + 
##     f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + f_blind + 
##     f_deaf + f_dumb + f_miss
## 
##   Res.Df   RSS Df Sum of Sq  Chisq Pr(>Chisq)
## 1    451 22314                               
## 2    436 22300 15    13.495 0.2638          1
qchisq(0.9, 2)
## [1] 4.60517

# HansenのJ検定
iv_robust(f_rw ~ f_prot + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss | kmwittenberg * (lnpop + gpop) + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + f_blind + f_deaf + f_dumb + f_miss, se_type = "stata", data = ipehd_qje2009_master, diagnostics = T) %>% summary()
## 
## Call:
## iv_robust(formula = f_rw ~ f_prot + f_young + f_jew + f_fem + 
##     f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + 
##     f_dumb + f_miss | kmwittenberg * (lnpop + gpop) + f_young + 
##     f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + f_blind + 
##     f_deaf + f_dumb + f_miss, data = ipehd_qje2009_master, se_type = "stata", 
## 
## Standard error type:  HC1 
## 
## Coefficients:
##             Estimate Std. Error t value  Pr(>|t|) CI Lower  CI Upper  DF
## (Intercept) 179.1507   29.13265   6.149 1.753e-09 121.8935 236.40788 438
## f_prot        0.1853    0.02572   7.205 2.553e-12   0.1348   0.23591 438
## f_young      -1.9518    0.16695 -11.691 1.145e-27  -2.2799  -1.62366 438
## f_jew        -0.4553    0.34569  -1.317 1.885e-01  -1.1348   0.22406 438
## f_fem        -1.0800    0.33013  -3.271 1.155e-03  -1.7288  -0.43113 438
## f_ortsgeb     0.6023    0.04980  12.093 2.996e-29   0.5044   0.70016 438
## f_pruss      -0.1858    0.14074  -1.320 1.876e-01  -0.4624   0.09085 438
## hhsize        0.7897    1.60849   0.491 6.237e-01  -2.3716   3.95102 438
## lnpop        -1.3133    0.95155  -1.380 1.682e-01  -3.1835   0.55683 438
## gpop          0.4020    0.10650   3.775 1.823e-04   0.1927   0.61133 438
## f_blind     -28.4769   14.69031  -1.938 5.321e-02 -57.3492   0.39532 438
## f_deaf      -52.1996   10.37314  -5.032 7.092e-07 -72.5869 -31.81228 438
## f_dumb        7.5271    1.68652   4.463 1.029e-05   4.2124  10.84175 438
## f_miss       -0.4979    0.37693  -1.321 1.872e-01  -1.2388   0.24288 438
## 
## Multiple R-squared:  0.692 , Adjusted R-squared:  0.6828 
## F-statistic: 66.54 on 13 and 438 DF,  p-value: < 2.2e-16
## 
## Diagnostics:
##                  numdf dendf  value  p.value    
## Weak instruments     3   436 29.794  < 2e-16 ***
## Wu-Hausman           1   437 14.325 0.000175 ***
## Overidentifying      2    NA  0.382 0.826014    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 異なる値になる?
#summary(model_73_2, diagnostics = TRUE)

表7-2 記述統計量

やはり第5章の表5-5や第6章の表6-3と同様にdatasummary()を用いてデータフレームを書き出し, 適宜リネームを行えばよい.

# 変数を選択
vars <- ipehd_qje2009_master %>%
    select(f_rw, f_prot, kmwittenberg, f_young, f_jew, f_fem, f_ortsgeb, f_pruss, hhsize, lnpop, gpop, f_blind, f_deaf, f_dumb, f_miss)
table63 <- datasummary(All(vars) ~ N + Mean + SD + Min + Max,
            output = "data.frame",
            data = ipehd_qje2009_master,
            fmt = 3)
# 列名
colnames(table63) <- c("変数", "サンプルサイズ", "平均", "標準偏差", "最小値", "最大値")
# 変数名
table63[,1] <- c("識字率", "新教徒率", "距離", "子供率", "ユダヤ率", "女性率", "出身者率", "普人率", "平均家計人数", "対数人口", "人口成長率", "視覚障害率", "聴覚障害率", "知的・精神障害率", "欠落率")
# 表を出力
gt(table63)

変数	サンプルサイズ	平均	標準偏差	最小値	最大値
識字率	452	87.507	12.673	37.397	99.332
新教徒率	452	64.181	37.831	0.258	99.889
距離	452	326.185	148.769	0.000	731.460
子供率	452	24.707	2.478	15.334	29.868
ユダヤ率	452	1.139	1.327	0.000	12.869
女性率	452	51.003	1.507	43.969	54.627
出身者率	452	58.970	12.386	32.009	87.230
普人率	452	99.072	1.966	74.215	99.995
平均家計人数	452	4.791	0.344	3.826	5.861
対数人口	452	10.804	0.415	9.360	13.625
人口成長率	452	1.595	4.932	-7.762	33.833
視覚障害率	452	0.095	0.031	0.034	0.238
聴覚障害率	452	0.101	0.054	0.022	0.421
知的・精神障害率	452	0.229	0.174	0.022	1.558
欠落率	452	1.689	1.105	0.000	6.720

表7-3 推定結果

(6)の標準誤差がStataと一致していないため注意. modelsummaryの出力はkableExtra, gtなどで整形することができるが, 前者が数式との相性がよい.

models_73 <- list("(1)" = lm_robust(f_rw ~ f_prot, se_type = "stata", data = ipehd_qje2009_master),
                  "(2)" = iv_robust(f_rw ~ f_prot | kmwittenberg, se_type = "stata", data = ipehd_qje2009_master, diagnostics = TRUE),
                  "(3)" = lm_robust(f_rw ~ f_prot + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss, se_type = "stata", data = ipehd_qje2009_master),
                  "(4)" = iv_robust(f_rw ~ f_prot + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss | kmwittenberg + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss, se_type = "stata", data = ipehd_qje2009_master, diagnostics = TRUE),
                  "(5)" = lm_robust(f_prot ~ kmwittenberg + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss, se_type = "stata", data = ipehd_qje2009_master),
                  "(6)" = iv_robust(f_rw ~ f_prot + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + lnpop + gpop + f_blind + f_deaf + f_dumb + f_miss | kmwittenberg * (lnpop + gpop) + f_young + f_jew + f_fem + f_ortsgeb + f_pruss + hhsize + f_blind + f_deaf + f_dumb + f_miss, se_type = "HC1", data = ipehd_qje2009_master, diagnostics = TRUE))
cm <- c("f_rw" = "識字率",
        "f_prot" = "新教徒率",
        "kmwittenberg" = "距離",
        "f_young" = "子供率",
        "f_jew" = "ユダヤ率",
        "f_fem" = "女性率",
        "f_ortsgeb" = "出身者率",
        "f_pruss" = "普人率",
        "hhsize" = "平均家計人数",
        "lnpop" = "対数人口",
        "gpop" = "人口成長率",
        "f_blind" = "視覚障害率",
        "f_deaf" = "聴覚障害率",
        "f_dumb" = "知的・精神障害率",
        "f_miss" = "欠落率",
        "(Intercept)" = "定数項")
# スタイガー=ストック検定統計量の値を表示するモデルを指定
attr(models_73$`(2)`, "STAIGERSTOCK") <- TRUE
attr(models_73$`(4)`, "STAIGERSTOCK") <- TRUE
attr(models_73$`(6)`, "STAIGERSTOCK") <- TRUE
# J検定統計量の値を表示するモデルを指定
attr(models_73$`(6)`, "J") <- TRUE

glance_custom.iv_robust <- function(x) {
    # 上で指定した, スタイガー=ストック検定統計量の値を表示したいモデルでなければパス
    if (!isTRUE(attr(x, "STAIGERSTOCK"))) return(NULL)

    # スタイガー=ストック検定統計量の値を取得
    staigerstock <- summary(x)$diagnostic_first_stage_fstatistic

    # スタイガー=ストック検定統計量の値をまとめたtibbleを作成
    out <- tibble("staiger_stock_test" = staigerstock["value"],
                  "adj.r.squared" = "") # adjR2を消す

    # J検定統計量の値を表示するモデル
    if (isTRUE(attr(x, "J"))) {
        # J検定量の値を取得
        j <- summary(x)$diagnostic_overid_test
        # tibbleにJ検定量の値とそのP値を追加
        out <- out %>% mutate("j_test" = j["value"],
                              "p_value" = sprintf("(%.3f)", j["p.value"]))
    }
    return(out)
}
gm <- tribble(~raw,                 ~clean,                          ~fmt,
              "staiger_stock_test", "スタイガー=ストック検定統計量", 2,
              "j_test",             "$J$検定統計量",                 3,
              "p_value",            "     ",                         3,
              "adj.r.squared",      "$\\bar{R}^2$",                  3,
              "nobs",               "サンプルサイズ",                0)
rows <- tribble(~term, ~`(1)`, ~`(2)`, ~`(3)`, ~`(4)`, ~`(5)`, ~`(6)`,
                "被説明変数", "識字率", "識字率", "識字率", "識字率", "新教徒率", "識字率",
                "推定法", "OLS", "2SLS", "OLS", "2SLS", "OLS", "2SLS")
attr(rows, 'position') <- c(1, 2)

# kableExtraで出力して, 手動で下線を追加する.
# ただしkableExtraでHTML出力するとフッター(星の説明)の"<"の文字が消えてしまうので, estimateを手動してすることでフッターを自動生成させないようにし, notesで手動で追加する.
modelsummary(models_73,
             stars = c("*" = 0.05, "**" = 0.01, "***" = 0.001),
             gof_map = gm,
             coef_map = cm,
             add_rows = rows,
             estimate = "{estimate}{stars}",
             output = "kableExtra",
             notes = "* p &lt; 0.05, ** p &lt; 0.01, *** p &lt; 0.001") %>%
    row_spec(c(0, 2, 35, 37), extra_css = "border-bottom: 1.5px solid") %>%
    row_spec(32, extra_css = ";border-bottom: 1.5px solid") # 32行目の下 (estimateとstatisticsの境) のみコロンが必要

	(1)	(2)	(3)	(4)	(5)	(6)
被説明変数	識字率	識字率	識字率	識字率	新教徒率	識字率
推定法	OLS	2SLS	OLS	2SLS	OLS	2SLS
新教徒率	0.080***	0.422***	0.099***	0.189***		0.185***
	(0.016)	(0.071)	(0.010)	(0.027)		(0.026)
距離					−0.095***
					(0.011)
子供率			−1.936***	−1.952***	0.205	−1.952***
			(0.158)	(0.168)	(0.765)	(0.167)
ユダヤ率			−0.965**	−0.437	−7.264***	−0.455
			(0.329)	(0.350)	(1.231)	(0.346)
女性率			−1.280***	−1.073**	−0.557	−1.080**
			(0.301)	(0.331)	(1.302)	(0.330)
出身者率			0.484***	0.607***	−1.390***	0.602***
			(0.033)	(0.052)	(0.133)	(0.050)
普人率			−0.324*	−0.181	−1.935*	−0.186
			(0.135)	(0.142)	(0.790)	(0.141)
平均家計人数			−1.812	0.885	−14.610*	0.790
			(1.218)	(1.643)	(5.931)	(1.608)
対数人口			−1.183	−1.318	−0.977	−1.313
			(0.872)	(0.957)	(3.966)	(0.952)
人口成長率			0.186*	0.410***	−1.962***	0.402***
			(0.089)	(0.109)	(0.326)	(0.107)
視覚障害率			−45.200***	−27.865	−82.266	−28.477
			(13.113)	(14.985)	(57.017)	(14.690)
聴覚障害率			−47.188***	−52.383***	104.616***	−52.200***
			(9.759)	(10.412)	(31.533)	(10.373)
知的・精神障害率			7.475***	7.529***	−2.314	7.527***
			(1.411)	(1.706)	(8.732)	(1.687)
欠落率			−0.307	−0.505	1.729	−0.498
			(0.383)	(0.378)	(1.484)	(0.377)
定数項	82.374***	60.451***	227.884***	177.367***	478.896***	179.151***
	(1.413)	(5.113)	(24.391)	(29.496)	(119.492)	(29.133)
スタイガー=ストック検定統計量		46.81		68.95		29.79
\(J\)検定統計量						0.382
						(0.826)
\(\bar{R}^2\)	0.055		0.729		0.402
サンプルサイズ	452	452	452	452	452	452
* p < 0.05, p < 0.01, * p < 0.001

練習問題 7-13 [実証]

表7-3を参照せよ.

第7章 操作変数法