sensitivity_bounds#

Description#

Computes bounds on the strength of unobserved confounders using observed covariates, as in Cinelli and Hazlett (2020). The main generic function is ovb_bounds, which computes both the bounds on the strength of confounding and the adjusted estimates, standard errors, t-values and confidence intervals.

Other functions that compute only the bounds on the strength of confounding are also provided. They are useful for computing benchmarks from summary statistics alone, for example those reported in a published paper.
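When only a printed regression table is available, the partial R^2 of a covariate can be recovered from its t-statistic and the residual degrees of freedom through the standard identity t^2 / (t^2 + dof). The sketch below illustrates that identity; the helper name partial_r2_from_t and the numbers are hypothetical and are not taken from the package or from any paper.

```python
def partial_r2_from_t(t_statistic, dof):
    """Partial R^2 of a covariate, given its t-statistic and residual degrees of freedom."""
    if dof <= 0:
        raise ValueError("dof must be positive")
    t_squared = t_statistic ** 2
    return t_squared / (t_squared + dof)


# Hypothetical values, as if read off a published regression table:
# the benchmark covariate has t = 2.0 in an outcome regression with 96 residual dof.
r2yxj_dx = partial_r2_from_t(t_statistic=2.0, dof=96)
print(round(r2yxj_dx, 4))
```

A value obtained this way from the outcome regression plays the role of r2yxj_dx; the analogous value from a regression of the treatment on the covariates plays the role of r2dxj_x.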

Currently only the bounds based on partial R^2 are implemented. Other types of bounds are planned.

Reference#

Cinelli, C. and Hazlett, C. (2020), “Making Sense of Sensitivity: Extending Omitted Variable Bias.” Journal of the Royal Statistical Society, Series B (Statistical Methodology).

Examples#

Load example dataset

>>> import sensemakr as smkr
>>> darfur = smkr.load_darfur()

Fit a statsmodels OLSResults object (“fitted_model”)

>>> import statsmodels.formula.api as smf
>>> model = smf.ols(formula='peacefactor ~ directlyharmed + age + farmer_dar + herder_dar + pastvoted + hhsize_darfur + female + village', data=darfur)
>>> fitted_model = model.fit()

Bounds on the strength of confounders 1, 2, or 3 times as strong as female and 1, 2, or 3 times as strong as pastvoted

>>> smkr.ovb_bounds(model=fitted_model, treatment="directlyharmed", benchmark_covariates=["female", "pastvoted"], kd=[1, 2, 3])

Functions#

sensemakr.ovb_bounds(model, treatment, benchmark_covariates=None, kd=1, ky=None, alpha=0.05, h0=0, reduce=True, bound='partial r2', adjusted_estimates=True)#

Provide bounds on the strength of unobserved confounders using observed covariates, as in Cinelli and Hazlett (2020).

This is the main generic function. It computes both the bounds on the strength of confounding and, when adjusted_estimates is True, the adjusted estimates, standard errors, t-values and confidence intervals.

Currently only the bounds based on partial R^2 are implemented. Other types of bounds are planned.

Required parameters:

model and treatment.

Parameters:

model (statsmodels OLSResults object) – a fitted statsmodels OLSResults object for the restricted regression model, i.e. the model estimated without the unobserved confounder.
treatment (string) – a string with the name of the “treatment” variable, i.e. the independent variable of interest.
benchmark_covariates (string or list of strings) – a string or list of strings with the names of the variables to use for benchmark bounding (Default value = None).
kd (float or list of floats) – a float or list of floats with each being a multiple of the strength of association between a benchmark variable and the treatment variable to test with benchmark bounding (Default value = 1).
ky (float or list of floats) – same as kd, but expressed as a multiple of the strength of association between a benchmark variable and the outcome variable (Default value = None; the example on this page passes only kd, which suggests that ky then follows kd).
alpha (float) – a float with the significance level for the robustness value RV_qa to render the estimate not significant (Default value = 0.05).
h0 (float) – a float with the null hypothesis effect size (Default value = 0).
reduce (boolean) – whether the putative confounding reduces (True) or increases (False) the estimate (Default value = True).
bound (string) – type of bound to perform; as of now, only partial R^2 bounding is allowed (Default value = ‘partial r2’).
adjusted_estimates (boolean) – whether to compute bias-adjusted estimates, standard errors, and t-statistics (Default value = True).

Returns:

A Pandas DataFrame containing the following variables:

treatment : the name of the provided treatment variable.

bound_label : a string created by label_maker to serve as a label for the bound for printing & plotting purposes.

r2dz_x : a float or list of floats with the partial R^2 of a putative unobserved confounder “z” with the treatment variable “d”, with observed covariates “x” partialed out, as implied by z being kd-times as strong as the benchmark_covariates.

r2yz_dx : a float or list of floats with the partial R^2 of a putative unobserved confounder “z” with the outcome variable “y”, with observed covariates “x” and the treatment variable “d” partialed out, as implied by z being ky-times as strong as the benchmark_covariates.

adjusted_estimate : the estimate adjusted for the bias due to a confounder with the r2dz_x and r2yz_dx given above.

adjusted_se : the standard error adjusted for a confounder with the r2dz_x and r2yz_dx given above.

adjusted_t : the t-statistic adjusted for a confounder with the r2dz_x and r2yz_dx given above.

Return type:

Pandas DataFrame
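To make the adjusted columns concrete, the following is a minimal sketch of the adjustment formulas from Cinelli and Hazlett (2020), written in plain Python. It is not the package's code: the function name adjust_for_confounder and the input numbers are hypothetical, and the arguments (estimate, se, dof) are assumed to come from the restricted regression.

```python
import math


def adjust_for_confounder(estimate, se, dof, r2dz_x, r2yz_dx, h0=0.0, reduce=True):
    """Return (adjusted_estimate, adjusted_se, adjusted_t) for a hypothetical confounder."""
    # Absolute bias implied by the two partial R^2 values.
    bias = math.sqrt(r2yz_dx * r2dz_x / (1.0 - r2dz_x)) * se * math.sqrt(dof)
    sign = math.copysign(1.0, estimate)
    if reduce:
        # Confounding pulls the estimate towards zero (and possibly past it).
        adjusted_estimate = sign * (abs(estimate) - bias)
    else:
        # Confounding pushes the estimate away from zero.
        adjusted_estimate = sign * (abs(estimate) + bias)
    # Including the confounder would use up one more degree of freedom.
    adjusted_se = (math.sqrt((1.0 - r2yz_dx) / (1.0 - r2dz_x))
                   * se * math.sqrt(dof / (dof - 1.0)))
    adjusted_t = (adjusted_estimate - h0) / adjusted_se
    return adjusted_estimate, adjusted_se, adjusted_t


# Hypothetical inputs, for illustration only.
est_adj, se_adj, t_adj = adjust_for_confounder(
    estimate=0.1, se=0.02, dof=100, r2dz_x=0.01, r2yz_dx=0.1)
print(est_adj, se_adj, t_adj)
```

With reduce=True the adjusted estimate moves towards zero, and with reduce=False it moves away from zero, mirroring the reduce parameter described above.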

Example

>>> # Load example dataset
>>> import sensemakr as smkr
>>> darfur = smkr.load_darfur()
>>> # Fit a statsmodels OLSResults object ("fitted_model")
>>> import statsmodels.formula.api as smf
>>> model = smf.ols(formula='peacefactor ~ directlyharmed + age + farmer_dar + herder_dar + pastvoted + hhsize_darfur + female + village', data=darfur)
>>> fitted_model = model.fit()
>>> # Bounds on the strength of confounders 1, 2, or 3 times as strong as female
>>> # and 1, 2, or 3 times as strong as pastvoted
>>> smkr.ovb_bounds(model=fitted_model, treatment="directlyharmed", benchmark_covariates=["female", "pastvoted"], kd=[1, 2, 3])

sensemakr.ovb_partial_r2_bound(model=None, treatment=None, r2dxj_x=None, r2yxj_dx=None, benchmark_covariates=None, kd=1, ky=None)#

Provide a Pandas DataFrame with the bounds on the strength of the unobserved confounder.

Adjusted estimates, standard errors and t-values (among other quantities) are not returned by this function; the user needs to compute them from these bounds with the functions adjusted_estimate, adjusted_se and adjusted_t.
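The partial R^2 bounds themselves follow from the benchmark's own partial R^2 values. The sketch below writes out the bounding formulas from Cinelli and Hazlett (2020) in plain Python, so the arithmetic can be checked by hand. It is an illustration under stated assumptions, not the package's implementation: the function name partial_r2_bounds is hypothetical, and letting ky fall back to kd when omitted is an assumption of this sketch.

```python
import math


def partial_r2_bounds(r2dxj_x, r2yxj_dx, kd=1.0, ky=None):
    """Bounds (r2dz_x, r2yz_dx) for a confounder kd/ky times as strong as a benchmark."""
    if ky is None:
        ky = kd  # assumption of this sketch
    # Bound on the confounder's association with the treatment.
    r2dz_x = kd * r2dxj_x / (1.0 - r2dxj_x)
    if r2dz_x >= 1.0:
        raise ValueError("implied r2dz_x is >= 1; try a smaller kd")
    # Partial R^2 between the confounder and the benchmark, given x and d.
    r2zxj_xd = kd * r2dxj_x ** 2 / ((1.0 - kd * r2dxj_x) * (1.0 - r2dxj_x))
    # Bound on the confounder's association with the outcome, capped at 1.
    scale = (math.sqrt(ky) + math.sqrt(r2zxj_xd)) / math.sqrt(1.0 - r2zxj_xd)
    r2yz_dx = min(scale ** 2 * r2yxj_dx / (1.0 - r2yxj_dx), 1.0)
    return r2dz_x, r2yz_dx


# Hypothetical benchmark values, for illustration only.
r2dz_x, r2yz_dx = partial_r2_bounds(r2dxj_x=0.1, r2yxj_dx=0.1, kd=1.0)
print(round(r2dz_x, 4), round(r2yz_dx, 4))
```

For these inputs the bounds work out to 1/9 (about 0.1111) for r2dz_x and 1.25/9 (about 0.1389) for r2yz_dx.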