Instrumental variable analysis with binary outcome

Thanks guys! The 2sls approach does not recover the DGP. Perhaps this suggests that the the effect can’t be estimated with IV regression. However, I would love to understand why…

Here is what I get with 2sls.

from linearmodels.iv import IV2SLS
formula = 'Y ~ 1 + [X ~ I]'
mod = IV2SLS.from_formula(formula, dat_sim)
iv_res = mod.fit()
print(iv_res)

Results:

                          IV-2SLS Estimation Summary                          
==============================================================================
Dep. Variable:                      Y   R-squared:                      0.0007
Estimator:                    IV-2SLS   Adj. R-squared:                -0.0003
No. Observations:                1000   F-statistic:                    0.0061
Date:                Tue, Jul 09 2024   P-value (F-stat)                0.9376
Time:                        23:06:46   Distribution:                  chi2(1)
Cov. Estimator:                robust                                         
                                                                              
                             Parameter Estimates                              
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
Intercept      0.2754     0.0363     7.5762     0.0000      0.2041      0.3466
X              0.0111     0.1420     0.0783     0.9376     -0.2672      0.2894
==============================================================================

Endogenous: X
Instruments: I
Robust Covariance (Heteroskedastic)
Debiased: False

And using causalpy:

import causalpy as cp
from causalpy.pymc_experiments import InstrumentalVariable
from causalpy.pymc_models import InstrumentalVariableRegression

sample_kwargs = {
    "tune": 1000,
    "draws": 2000,
    "chains": 4,
    "cores": 4,
    "target_accept": 0.99,
}
instruments_formula = "X  ~ 1 + I"
formula = "Y ~  1 + X"
instruments_data = dat_sim[["X", "I"]]
data = dat_sim[["Y", "X"]]
iv = InstrumentalVariable(
    instruments_data=instruments_data,
    data=data,
    instruments_formula=instruments_formula,
    formula=formula,
    model=InstrumentalVariableRegression(sample_kwargs=sample_kwargs),
)

az.summary(iv.model.idata, var_names=["beta_t", "beta_z"])[
    ["mean", "sd", "hdi_3%", "hdi_97%", "r_hat"]
]

Gives us:

	mean	sd	hdi_3%	hdi_97%	r_hat
beta_t[Intercept]	0.030	0.031	-0.029	0.085	1.0
beta_t[I]	0.252	0.034	0.189	0.316	1.0
beta_z[Intercept]	0.277	0.037	0.211	0.345	1.0
beta_z[X]	0.006	0.142	-0.273	0.259	1.0
2 Likes