sensitivity_bounds#

Description#

Bounds on the strength of unobserved confounders using observed covariates, as in Cinelli and Hazlett (2020). The main generic function is ovb_bounds, which computes the bounds on the strength of confounding as well as the adjusted estimates, standard errors, t-values and confidence intervals.

Other functions that compute only the bounds on the strength of confounding are also provided. These are useful for computing benchmarks from summary statistics alone, for example from regression tables in published papers.

Currently, only bounds based on partial R2 are implemented. Other bounds will be implemented soon.

Reference#

Cinelli, C. and Hazlett, C. (2020), “Making Sense of Sensitivity: Extending Omitted Variable Bias.” Journal of the Royal Statistical Society, Series B (Statistical Methodology).

Examples#

Load example dataset

>>> import sensemakr as smkr
>>> darfur = smkr.load_darfur()

Fit a statsmodels OLSResults object (“fitted_model”)

>>> import statsmodels.formula.api as smf
>>> model = smf.ols(formula='peacefactor ~ directlyharmed + age + farmer_dar + herder_dar + pastvoted + hhsize_darfur + female + village', data=darfur)
>>> fitted_model = model.fit()

Bounds on the strength of confounders 1, 2, or 3 times as strong as female and 1, 2, or 3 times as strong as pastvoted

>>> smkr.ovb_bounds(model = fitted_model, treatment = "directlyharmed", benchmark_covariates = ["female", "pastvoted"], kd = [1, 2, 3]) 

Functions#

sensemakr.ovb_bounds(model, treatment, benchmark_covariates=None, kd=1, ky=None, alpha=0.05, h0=0, reduce=True, bound='partial r2', adjusted_estimates=True)[source]#

Provide bounds on the strength of unobserved confounders using observed covariates, as in Cinelli and Hazlett (2020).

The main generic function is ovb_bounds, which computes the bounds on the strength of confounding as well as the adjusted estimates, standard errors, t-values and confidence intervals.

Other functions that compute only the bounds on the strength of confounding are also provided. These are useful for computing benchmarks from summary statistics alone, for example from regression tables in published papers.

Currently, only bounds based on partial R2 are implemented. Other bounds will be implemented soon.

Required parameters:

model and treatment.

Parameters:
  • model (statsmodels OLSResults object) – a fitted statsmodels OLSResults object for the restricted regression model.

  • treatment (string) – a string with the name of the “treatment” variable, i.e. the independent variable of interest.

  • benchmark_covariates (string or list of strings) – a string or list of strings with names of the variables to use for benchmark bounding.

  • kd (float or list of floats) – a float or list of floats with each being a multiple of the strength of association between a benchmark variable and the treatment variable to test with benchmark bounding (Default value = 1).

  • ky (float or list of floats) – same as kd except measured in terms of strength of association with the outcome variable (Default value = None).

  • alpha (float) – a float with the significance level for the robustness value RV_qa to render the estimate not significant (Default value = 0.05).

  • h0 (float) – a float with the null hypothesis effect size (Default value = 0).

  • reduce (boolean) – whether to reduce (True, default) or increase (False) the estimate due to putative confounding.

  • bound (string) – type of bound to perform; as of now, only partial R^2 bounding is allowed (Default value = ‘partial r2’).

  • adjusted_estimates (boolean) – whether to compute bias-adjusted estimates, standard errors, and t-statistics (Default value = True).

Returns:

A Pandas DataFrame containing the following variables:

treatment : the name of the provided treatment variable.

bound_label : a string created by label_maker to serve as a label for the bound for printing & plotting purposes.

r2dz_x : a float or list of floats with the partial R^2 of a putative unobserved confounder “z” with the treatment variable “d”, with observed covariates “x” partialed out, as implied by z being kd-times as strong as the benchmark_covariates.

r2yz_dx : a float or list of floats with the partial R^2 of a putative unobserved confounder “z” with the outcome variable “y”, with observed covariates “x” and the treatment variable “d” partialed out, as implied by z being ky-times as strong as the benchmark_covariates.

adjusted_estimate : the estimate adjusted for the bias due to a confounder with the r2dz_x and r2yz_dx given above.

adjusted_se : the standard error adjusted for the bias due to a confounder with the r2dz_x and r2yz_dx given above.

adjusted_t : the t-statistic adjusted for the bias due to a confounder with the r2dz_x and r2yz_dx given above.

Return type:

Pandas DataFrame
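
The adjusted_* columns can also be reproduced by hand from r2dz_x and r2yz_dx. The following is a minimal sketch of the omitted variable bias formulas in Cinelli and Hazlett (2020), using only the standard library. The helper name adjusted_stats is hypothetical and not part of sensemakr, the partial R2 inputs are illustrative values, and the library's own implementation may differ in details.

```python
import math

def adjusted_stats(estimate, se, dof, r2dz_x, r2yz_dx, reduce=True, h0=0.0):
    """Hypothetical helper (not part of sensemakr): adjusted estimate, SE and t."""
    # Absolute bias implied by a confounder with the given partial R2 values.
    bias = math.sqrt(r2yz_dx * r2dz_x / (1.0 - r2dz_x)) * se * math.sqrt(dof)
    sign = math.copysign(1.0, estimate)
    if reduce:
        # Confounding is assumed to pull the estimate toward zero.
        adj_estimate = sign * (abs(estimate) - bias)
    else:
        # Confounding is assumed to push the estimate away from zero.
        adj_estimate = sign * (abs(estimate) + bias)
    # Standard error after accounting for the confounder (one fewer degree of freedom).
    adj_se = (math.sqrt((1.0 - r2yz_dx) / (1.0 - r2dz_x))
              * se * math.sqrt(dof / (dof - 1.0)))
    adj_t = (adj_estimate - h0) / adj_se
    return adj_estimate, adj_se, adj_t

# Estimate, standard error and degrees of freedom as in the summary-statistics
# example on this page; the two partial R2 values are illustrative inputs.
est, se_adj, t_adj = adjusted_stats(0.0973, 0.0232, 783,
                                    r2dz_x=0.0092, r2yz_dx=0.1246)
print(est, se_adj, t_adj)
```

With reduce=False the same bias is added to the magnitude of the estimate instead of subtracted, which corresponds to the "increase" direction described for the reduce parameter.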

Example

>>> # Load example dataset
>>> import sensemakr as smkr
>>> darfur = smkr.load_darfur()
>>> # Fit a statsmodels OLSResults object ("fitted_model")
>>> import statsmodels.formula.api as smf
>>> model = smf.ols(formula='peacefactor ~ directlyharmed + age + farmer_dar + herder_dar + pastvoted + hhsize_darfur + female + village', data=darfur)
>>> fitted_model = model.fit()
>>> # Bounds on the strength of confounders 1, 2, or 3 times as strong as female
>>> # and 1, 2, or 3 times as strong as pastvoted
>>> smkr.ovb_bounds(model = fitted_model, treatment = "directlyharmed", benchmark_covariates = ["female", "pastvoted"], kd = [1, 2, 3])

sensemakr.ovb_partial_r2_bound(model=None, treatment=None, r2dxj_x=None, r2yxj_dx=None, benchmark_covariates=None, kd=1, ky=None)[source]#

Provide a Pandas DataFrame with the bounds on the strength of the unobserved confounder.

Adjusted estimates, standard errors and t-values (among other quantities) need to be computed manually by the user using those bounds with the functions adjusted_estimate, adjusted_se and adjusted_t.

Required parameters:

(model and treatment) or (r2dxj_x and r2yxj_dx).

Parameters:
  • model (statsmodels OLSResults object) – a fitted statsmodels OLSResults object for the restricted regression model.

  • treatment (string) – a string with the name of the “treatment” variable, i.e. the independent variable of interest.

  • r2dxj_x (float) – float with the partial R2 of covariate Xj with the treatment D (after partialling out the effect of the remaining covariates X, excluding Xj).

  • r2yxj_dx (float) – float with the partial R2 of covariate Xj with the outcome Y (after partialling out the effect of the treatment D and the remaining covariates X, excluding Xj).

  • benchmark_covariates (string or list of strings) – a string or list of strings with names of the variables to use for benchmark bounding.

  • kd (float or list of floats) – a float or list of floats with each being a multiple of the strength of association between a benchmark variable and the treatment variable to test with benchmark bounding (Default value = 1).

  • ky (float or list of floats) – same as kd except measured in terms of strength of association with the outcome variable (Default value = None).

Returns:

A Pandas DataFrame containing the following variables:

bound_label : a string created by label_maker to serve as a label for the bound for printing & plotting purposes.

r2dz_x : a float or list of floats with the partial R^2 of a putative unobserved confounder “z” with the treatment variable “d”, with observed covariates “x” partialed out, as implied by z being kd-times as strong as the benchmark_covariates.

r2yz_dx : a float or list of floats with the partial R^2 of a putative unobserved confounder “z” with the outcome variable “y”, with observed covariates “x” and the treatment variable “d” partialed out, as implied by z being ky-times as strong as the benchmark_covariates.

Return type:

Pandas DataFrame
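
To make the relationship between the benchmark partial R2 values and the returned r2dz_x and r2yz_dx concrete, here is a minimal sketch of the partial R2 bounding formulas from Cinelli and Hazlett (2020) for a single benchmark covariate. The helper partial_r2_bounds is hypothetical and not part of sensemakr; defaulting ky to kd when ky is None is an assumption, and the input values are illustrative.

```python
import math

def partial_r2_bounds(r2dxj_x, r2yxj_dx, kd=1.0, ky=None):
    """Hypothetical helper (not part of sensemakr): bounds for one benchmark."""
    if ky is None:
        ky = kd  # assumption: ky falls back to kd when not supplied
    # Bound on the partial R2 of the confounder Z with the treatment D.
    r2dz_x = kd * r2dxj_x / (1.0 - r2dxj_x)
    if r2dz_x >= 1.0:
        raise ValueError("implied r2dz_x >= 1; try a smaller kd")
    # Partial R2 of Z with the benchmark Xj, given D and the other covariates.
    r2zxj_xd = kd * r2dxj_x ** 2 / ((1.0 - kd * r2dxj_x) * (1.0 - r2dxj_x))
    if r2zxj_xd >= 1.0:
        raise ValueError("implied r2zxj_xd >= 1; try a smaller kd")
    # Bound on the partial R2 of Z with the outcome Y, capped at 1.
    scale = (math.sqrt(ky) + math.sqrt(r2zxj_xd)) / math.sqrt(1.0 - r2zxj_xd)
    r2yz_dx = scale ** 2 * r2yxj_dx / (1.0 - r2yxj_dx)
    return r2dz_x, min(r2yz_dx, 1.0)

# Illustrative benchmark values; a confounder 1, 2 or 3 times as strong.
for k in (1, 2, 3):
    print(k, partial_r2_bounds(r2dxj_x=0.0091, r2yxj_dx=0.109, kd=k))
```

Note that r2dz_x grows linearly in kd, while r2yz_dx also depends on kd through the implied association between the confounder and the benchmark covariate.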

Examples

Let’s construct bounds from summary statistics only. Suppose you didn’t have access to the data, but only to the treatment and outcome regression tables. You can still compute the bounds.

>>> # First import the necessary libraries.
>>> import sensemakr as smkr
>>> # Use the t statistic of female in the outcome regression to compute the partial R2 of female with the outcome.
>>> r2yxj_dx = smkr.partial_r2(t_statistic = -9.789, dof = 783)
>>> # Use the t-value of female in the *treatment* regression to compute the partial R2 of female with the treatment.
>>> r2dxj_x = smkr.partial_r2(t_statistic = -2.680, dof = 783)
>>> # Manually compute bounds on the strength of confounders 1, 2, or 3 times as strong as female.
>>> bounds = smkr.ovb_partial_r2_bound(r2dxj_x = r2dxj_x, r2yxj_dx = r2yxj_dx, kd = [1, 2, 3], ky = [1, 2, 3])
>>> # Manually compute adjusted estimates.
>>> bound_values = smkr.adjusted_estimate(estimate = 0.0973, se = 0.0232, dof = 783, r2dz_x = bounds['r2dz_x'], r2yz_dx = bounds['r2yz_dx'])
>>> # Plot contours and bounds.
>>> smkr.ovb_contour_plot(estimate = 0.0973, se = 0.0232, dof = 783)
>>> smkr.add_bound_to_contour(bounds=bounds, bound_value = bound_values)
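
The conversion from a t-statistic to a partial R2 used at the start of this example relies on a standard identity, t^2 / (t^2 + dof). The sketch below applies it with plain Python to the t-statistics and degrees of freedom quoted above; the function name partial_r2_from_t is hypothetical, and it is an assumption that smkr.partial_r2 computes the same quantity.

```python
def partial_r2_from_t(t_statistic, dof):
    """Hypothetical helper: partial R2 implied by a coefficient's t-statistic."""
    t2 = t_statistic ** 2
    return t2 / (t2 + dof)

# t-statistics of female in the outcome and treatment regressions (from the example).
r2yxj_dx = partial_r2_from_t(-9.789, 783)
r2dxj_x = partial_r2_from_t(-2.680, 783)
print(r2yxj_dx, r2dxj_x)
```

Because only the squared t-statistic enters the formula, the sign of the coefficient does not affect the resulting partial R2.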