sensitivity_plots

Description

This module provides sensitivity contour plots and extreme scenario sensitivity plots. They can be produced from an object of class Sensemakr, directly from a fitted statsmodels OLS model, or by providing the required statistics manually.

Functions

sensemakr.ovb_contour_plot(sense_obj=None, sensitivity_of='estimate', model=None, treatment=None, estimate=None, se=None, dof=None, benchmark_covariates=None, kd=1, ky=None, r2dz_x=None, r2yz_dx=None, bound_label=None, reduce=True, estimate_threshold=0, t_threshold=2, lim=None, lim_y=None, col_contour='black', col_thr_line='red', label_text=True, label_bump_x=None, label_bump_y=None, xlab=None, ylab=None, plot_margin_fraction=0.05, round_dig=3, n_levels=None)

Contour plots of omitted variable bias for sensitivity analysis.

The main inputs are a fitted statsmodels OLSResults object, the treatment variable, and the covariates used for benchmarking the strength of unobserved confounding.

The horizontal axis of the plot shows hypothetical values of the partial R2 of the unobserved confounder(s) with the treatment. The vertical axis shows hypothetical values of the partial R2 of the unobserved confounder(s) with the outcome. The contour levels represent the adjusted estimates (or t-values) of the treatment effect.
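
As a rough illustration of what the contour levels encode, the sketch below evaluates the adjusted estimate and adjusted t-value on a small grid, using the omitted variable bias formulas from Cinelli and Hazlett (2020). It is not the library's implementation: the helper names and the input values for estimate, se and dof are hypothetical, and se is assumed to be the standard error of the treatment coefficient.

```python
import numpy as np


def adjusted_estimate(estimate, se, dof, r2dz_x, r2yz_dx, reduce=True):
    """Estimate adjusted for a confounder with the given partial R2 values.

    Hypothetical helper for illustration, not part of sensemakr's API.
    """
    r2dz_x = np.asarray(r2dz_x, dtype=float)
    r2yz_dx = np.asarray(r2yz_dx, dtype=float)
    # Absolute bias: se * sqrt(dof) * sqrt(R2yz.dx * R2dz.x / (1 - R2dz.x))
    bias = se * np.sqrt(dof) * np.sqrt(r2yz_dx * r2dz_x / (1.0 - r2dz_x))
    sign = np.sign(estimate)
    if reduce:
        return sign * (abs(estimate) - bias)  # pull the estimate toward zero
    return sign * (abs(estimate) + bias)      # push the estimate away from zero


def adjusted_t(estimate, se, dof, r2dz_x, r2yz_dx, reduce=True):
    """Adjusted t-value (null hypothesis of zero) for the same confounder."""
    r2dz_x = np.asarray(r2dz_x, dtype=float)
    r2yz_dx = np.asarray(r2yz_dx, dtype=float)
    adj_se = se * np.sqrt((1.0 - r2yz_dx) / (1.0 - r2dz_x)) * np.sqrt(dof / (dof - 1.0))
    return adjusted_estimate(estimate, se, dof, r2dz_x, r2yz_dx, reduce) / adj_se


# Hypothetical regression summary, chosen only to keep the arithmetic simple.
estimate, se, dof = 0.1, 0.02, 100

# Horizontal axis: partial R2 with the treatment; vertical axis: with the outcome.
r2dz_grid = np.linspace(0.0, 0.4, 5)
r2yz_grid = np.linspace(0.0, 0.4, 5)
X, Y = np.meshgrid(r2dz_grid, r2yz_grid)

estimate_grid = adjusted_estimate(estimate, se, dof, X, Y)
t_grid = adjusted_t(estimate, se, dof, X, Y)
print(np.round(estimate_grid, 3))
```

Each cell of estimate_grid is the value a contour line passing through that point of the plot would carry. The origin (no confounding) reproduces the original estimate.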

The reference points are the bounds on the partial R2 of the unobserved confounder if it were k times “as strong” as the observed covariates used for benchmarking (see arguments kd and ky). The dotted red line shows the chosen critical threshold (for instance, zero): confounders with such strength (or stronger) are sufficient to invalidate the research conclusions. All results are exact for single confounders and conservative for multiple/nonlinear confounders.
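
To make the threshold line concrete, the following sketch solves the bias formula of Cinelli and Hazlett (2020) for the partial R2 with the outcome that would move the estimate exactly to the threshold, for a given partial R2 with the treatment. The function name and the numbers are hypothetical, and the sketch assumes reduce=True with the threshold lying between the estimate and the direction in which confounding pulls it.

```python
import numpy as np


def r2yz_dx_at_threshold(estimate, se, dof, r2dz_x, threshold=0.0):
    """Partial R2 with the outcome needed to move `estimate` to `threshold`.

    Hypothetical helper for illustration, not part of sensemakr's API.
    Returns NaN where even a confounder explaining all residual outcome
    variance (r2yz_dx = 1) would not be enough.
    """
    r2dz_x = np.asarray(r2dz_x, dtype=float)
    required_bias = abs(estimate - threshold)
    # Invert bias = se * sqrt(dof) * sqrt(r2yz_dx * r2dz_x / (1 - r2dz_x))
    r2yz_dx = required_bias**2 * (1.0 - r2dz_x) / (se**2 * dof * r2dz_x)
    return np.where(r2yz_dx <= 1.0, r2yz_dx, np.nan)


# Hypothetical regression summary.
estimate, se, dof = 0.1, 0.02, 100

needed = r2yz_dx_at_threshold(estimate, se, dof, r2dz_x=[0.1, 0.5])
print(needed)
```

With these illustrative inputs, a confounder explaining 10% of the residual treatment variance cannot bring the estimate to zero, whereas one explaining 50% would need to explain about 25% of the residual outcome variance.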

See Cinelli and Hazlett (2020) for details.

Parameters:
  • sense_obj (sensemakr object) – a sensemakr object to plot.

  • sensitivity_of (string) – either “estimate” or “t-value”. (Default value = ‘estimate’).

  • model (statsmodels OLSResults object) – a fitted statsmodels OLSResults object.

  • treatment (string) – a string with the name of the “treatment” variable, i.e. the independent variable of interest.

  • estimate (float) – a float with the estimate of the coefficient for the independent variable of interest.

  • se (float) – a float with the standard error of the estimate (the coefficient of the independent variable of interest).

  • dof (int) – an int with the degrees of freedom of the regression.

  • benchmark_covariates (string or list of strings) – a string or list of strings with the names of the variables to use for benchmarking.

  • kd (float or list of floats) – a float or list of floats. Parameterizes how many times stronger the confounder is related to the treatment in comparison to the observed benchmark covariate. Default value is 1 (confounder is as strong as benchmark covariate).

  • ky (float or list of floats) – a float or list of floats. Parameterizes how many times stronger the confounder is related to the outcome in comparison to the observed benchmark covariate. Default value is the same as kd.

  • r2dz_x (float or list of floats) – a float or list of floats. Hypothetical partial R2 of unobserved confounder Z with treatment D, given covariates X.

  • r2yz_dx (float or list of floats) – a float or list of floats. Hypothetical partial R2 of unobserved confounder Z with outcome Y, given covariates X and treatment D.

  • bound_label (string) – label of the bound variable.

  • reduce (boolean) – whether to reduce (True) or increase (False) the estimate due to putative confounding, default is True.

  • estimate_threshold (float) – threshold line to emphasize when contours correspond to estimate, default is 0.

  • t_threshold (float) – threshold line to emphasize when contours correspond to t-value, default is 2.

  • lim (float) – x axis maximum.

  • lim_y (float) – y axis maximum.

  • col_contour (string of color) – color of the contour line, default is “black”.

  • col_thr_line (string of color) – color of the threshold line, default is “red”.

  • label_text (boolean) – whether to include label text.

  • label_bump_x (float) – x-axis position of label above 0.

  • label_bump_y (float) – y-axis position of label above 0.

  • xlab (string) – x-axis label text.

  • ylab (string) – y-axis label text.

  • plot_margin_fraction (float) – margin fraction added to the top of lim and lim_y.

  • round_dig (int) – rounding digit of the display numbers, default is 3.

  • n_levels (int) – maximum number of contours in the plot.

Returns:

a contour plot of omitted variable bias for the corresponding model/sense_obj.

Return type:

plot

Examples

>>> # Load example dataset:
>>> import sensemakr as smkr
>>> darfur = smkr.load_darfur()
>>> # Fit a statsmodels OLSResults object ("fitted_model")
>>> import statsmodels.formula.api as smf
>>> model = smf.ols(formula='peacefactor ~ directlyharmed + age + farmer_dar + herder_dar + pastvoted + hhsize_darfur + female + village', data=darfur)
>>> fitted_model = model.fit()
>>> # Contours directly from OLS object
>>> ## Plot contour of the fitted model with directlyharmed as treatment and "female" as benchmark_covariates.
>>> smkr.ovb_contour_plot(model=fitted_model,treatment='directlyharmed',benchmark_covariates='female')
>>> ## Plot contour of the fitted model with directlyharmed as treatment and "female" as benchmark_covariates kd=[1,2,3]
>>> smkr.ovb_contour_plot(model=fitted_model,treatment='directlyharmed',benchmark_covariates='female',kd=[1,2,3])
>>> ## Plot contour of the fitted model with manual benchmark
>>> smkr.ovb_contour_plot(model=fitted_model,treatment='directlyharmed',r2dz_x=0.1)
>>> # Contours from Sensemakr object
>>> sensitivity = smkr.Sensemakr(fitted_model, treatment = "directlyharmed", benchmark_covariates = "female", kd = [1, 2, 3])
>>> smkr.ovb_contour_plot(sense_obj=sensitivity, sensitivity_of='estimate')

sensemakr.add_bound_to_contour(model=None, benchmark_covariates=None, kd=1, ky=None, reduce=None, treatment=None, bounds=None, r2dz_x=None, r2yz_dx=None, bound_value=None, bound_label=None, sensitivity_of=None, label_text=True, label_bump_x=None, label_bump_y=None, round_dig=3)

Add bound label to the contour plot of omitted variable bias for sensitivity analysis.

The main inputs are a fitted statsmodels OLSResults object, the treatment variable, and the covariates used for benchmarking the strength of unobserved confounding.

The reference points are the bounds on the partial R2 of the unobserved confounder if it were k times “as strong” as the observed covariate used for benchmarking (see arguments kd and ky).
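
The sketch below shows how such a reference point can be derived from the benchmark covariate's own partial R2 values, following the bounding formulas in Cinelli and Hazlett (2020). It is an illustration under stated assumptions rather than the library's code: the function name, the benchmark partial R2 inputs, and the choice to cap the outcome bound at 1 are all hypothetical.

```python
import math


def benchmark_bounds(r2dxj_x, r2yxj_dx, kd=1.0, ky=None):
    """Bounds on the confounder's partial R2 with treatment and outcome.

    Hypothetical helper for illustration, not part of sensemakr's API.
    r2dxj_x:  partial R2 of the benchmark covariate with the treatment.
    r2yxj_dx: partial R2 of the benchmark covariate with the outcome.
    """
    if ky is None:
        ky = kd  # mirrors the documented default: ky equals kd
    # Treatment side: kd times the benchmark's partial Cohen's f2.
    r2dz_x = kd * r2dxj_x / (1.0 - r2dxj_x)
    if r2dz_x >= 1.0:
        raise ValueError("Implied r2dz_x >= 1; try a lower kd.")
    # Partial R2 of the confounder with the benchmark implied by kd.
    r2zxj_xd = kd * r2dxj_x**2 / ((1.0 - kd * r2dxj_x) * (1.0 - r2dxj_x))
    if r2zxj_xd >= 1.0:
        raise ValueError("Impossible kd value; try a lower kd.")
    # Outcome side: inflated multiple of the benchmark's partial f2.
    eta_sq = (math.sqrt(ky) + math.sqrt(r2zxj_xd)) ** 2 / (1.0 - r2zxj_xd)
    r2yz_dx = eta_sq * r2yxj_dx / (1.0 - r2yxj_dx)
    return r2dz_x, min(r2yz_dx, 1.0)  # a partial R2 cannot exceed 1


# Hypothetical benchmark strengths.
bound_r2dz_x, bound_r2yz_dx = benchmark_bounds(r2dxj_x=0.2, r2yxj_dx=0.1, kd=1)
print(round(bound_r2dz_x, 4), round(bound_r2yz_dx, 4))
```

The resulting pair gives the horizontal and vertical coordinates of one reference point on the contour plot.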

Parameters:
  • model (statsmodels OLSResults object) – a fitted statsmodels OLSResults object for the restricted regression model you have provided.

  • benchmark_covariates (string or list of strings) – a string or list of strings with the names of the variables to use for benchmark bounding.

  • kd (float or list of floats) – a float or list of floats with each being a multiple of the strength of association between a benchmark variable and the treatment variable to test with benchmark bounding (Default value = 1).

  • ky (float or list of floats) – same as kd except measured in terms of strength of association with the outcome variable.

  • reduce (boolean) – whether to reduce (True, default) or increase (False) the estimate due to putative confounding.

  • treatment (string) – a string with the name of the “treatment” variable, i.e. the independent variable of interest.

  • bounds (pandas dataframe) – A pandas dataframe with bounds on the strength of confounding according to some benchmark covariates, as computed by the function ovb_bounds.

  • r2dz_x (float or list of floats) – a float or list of floats with the partial R^2 of a putative unobserved confounder “z” with the treatment variable “d”, with observed covariates “x” partialed out, as implied by z being kd-times as strong as the benchmark_covariates.

  • r2yz_dx (float or list of floats) – a float or list of floats with the partial R^2 of a putative unobserved confounder “z” with the outcome variable “y”, with observed covariates “x” and the treatment variable “d” partialed out, as implied by z being ky-times as strong as the benchmark_covariates.

  • bound_value (float) – the value of the reference point.

  • bound_label (string) – a string that labels the reference point.

  • sensitivity_of (string) – either “estimate” or “t-value”.

  • label_text (boolean) – whether to include label text.

  • label_bump_x (float) – x-axis position of label above 0.

  • label_bump_y (float) – y-axis position of label above 0.

  • round_dig (int) – rounding digit of the display numbers, default=3.

Returns:

adds a bound label to the existing contour plot.

Return type:

add on existing plot

Examples

>>> # Load example dataset:
>>> import sensemakr as smkr
>>> darfur = smkr.load_darfur()
>>> # Fit a statsmodels OLSResults object ("fitted_model"):
>>> import statsmodels.formula.api as smf
>>> model = smf.ols(formula='peacefactor ~ directlyharmed + age + farmer_dar + herder_dar + pastvoted + hhsize_darfur + female + village', data=darfur)
>>> fitted_model = model.fit()
>>> # Runs sensemakr for sensitivity analysis
>>> sensitivity = smkr.Sensemakr(fitted_model, treatment = "directlyharmed", benchmark_covariates = "female", kd = [1, 2, 3])
>>> # Plot contour of the fitted model with directlyharmed as treatment and "female" as benchmark_covariates.
>>> smkr.ovb_contour_plot(model=fitted_model,treatment='directlyharmed',benchmark_covariates='female')
>>> # Add bound to contour.
>>> smkr.add_bound_to_contour(model=fitted_model,treatment='directlyharmed',benchmark_covariates='female',kd=[2,3])

sensemakr.ovb_extreme_plot(sense_obj=None, model=None, treatment=None, estimate=None, se=None, dof=None, benchmark_covariates=None, kd=1, ky=None, r2dz_x=None, r2yz_dx=[1, 0.75, 0.5], reduce=True, threshold=0, lim=None, lim_y=None, xlab=None, ylab=None)

Extreme scenario plots of omitted variable bias for sensitivity analysis.

The main inputs are a fitted statsmodels OLSResults object, the treatment variable, and the covariates used for benchmarking the strength of unobserved confounding.

The horizontal axis shows the partial R2 of the unobserved confounder(s) with the treatment. The vertical axis shows the adjusted treatment effect estimate. The partial R2 of the confounder with the outcome is represented by a separate curve for each scenario, as given by the parameter r2yz_dx. The red marks on the horizontal axis are bounds on the partial R2 of an unobserved confounder kd times as strong as the covariates used for benchmarking. The dotted red line represents the threshold at which the effect estimate is deemed problematic (for instance, zero).
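
A minimal sketch of the curves such a plot draws is given below, again using the bias formula from Cinelli and Hazlett (2020) rather than the library's own code. The regression summary is hypothetical, the estimate is assumed positive, and confounding is assumed to reduce it (reduce=True).

```python
import numpy as np

# Hypothetical regression summary; se is the treatment coefficient's standard error.
estimate, se, dof = 0.1, 0.02, 100

# Horizontal axis: partial R2 of the confounder with the treatment.
r2dz_x = np.array([0.0, 0.1, 0.2, 0.3])

# One curve per scenario, matching the documented default r2yz_dx=[1, 0.75, 0.5].
scenarios = [1.0, 0.75, 0.5]

curves = {}
for r2yz_dx in scenarios:
    bias = se * np.sqrt(dof) * np.sqrt(r2yz_dx * r2dz_x / (1.0 - r2dz_x))
    curves[r2yz_dx] = estimate - bias  # adjusted estimate along the curve

for r2yz_dx, values in curves.items():
    print(f"r2yz_dx={r2yz_dx}: {np.round(values, 3)}")
```

All curves start at the unadjusted estimate and fall as r2dz_x grows. The r2yz_dx = 1 curve is the most pessimistic scenario, in which the confounder explains all residual variance of the outcome.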

See Cinelli and Hazlett (2020) for details.

Parameters:
  • sense_obj (sensemakr object) – a sensemakr object.

  • model (statsmodels OLSResults object) – a fitted statsmodels OLSResults object for the restricted regression model you have provided.

  • treatment (string) – a string with the name of the “treatment” variable, i.e. the independent variable of interest.

  • estimate (float) – a float with the estimate of the coefficient for the independent variable of interest.

  • se (float) – a float with the standard error of the estimate (the coefficient of the independent variable of interest).

  • dof (int) – an int with the degrees of freedom of the regression.

  • benchmark_covariates (string or list of strings) – a string or list of strings with the names of the variables to use for benchmark bounding.

  • kd (float or list of floats) – a float or list of floats with each being a multiple of the strength of association between a benchmark variable and the treatment variable to test with benchmark bounding (Default value = 1).

  • ky (float or list of floats) – same as kd except measured in terms of strength of association with the outcome variable.

  • r2dz_x (float or list of floats) – a float or list of floats with the partial R^2 of a putative unobserved confounder “z” with the treatment variable “d”, with observed covariates “x” partialed out, as implied by z being kd-times as strong as the benchmark_covariates.

  • r2yz_dx (float or list of floats) – a float or list of floats with the partial R^2 of a putative unobserved confounder “z” with the outcome variable “y”, with observed covariates “x” and the treatment variable “d” partialed out, as implied by z being ky-times as strong as the benchmark_covariates, default=[1,0.75,0.5].

  • reduce (boolean) – whether to reduce (True) or increase (False) the estimate due to putative confounding, default=True.

  • threshold (float) – threshold line to emphasize when drawing estimate, default=0.

  • lim (float) – range of x-axis.

  • lim_y (float) – range of y-axis.

  • xlab (string) – x-axis label text.

  • ylab (string) – y-axis label text.

Returns:

an extreme value plot of omitted variable bias for the corresponding model/sense_obj.

Return type:

plot

Examples

>>> # Load example dataset:
>>> import sensemakr as smkr
>>> darfur = smkr.load_darfur()
>>> # Fit a statsmodels OLSResults object ("fitted_model"):
>>> import statsmodels.formula.api as smf
>>> model = smf.ols(formula='peacefactor ~ directlyharmed + age + farmer_dar + herder_dar + pastvoted + hhsize_darfur + female + village', data=darfur)
>>> fitted_model = model.fit()
>>> # Runs sensemakr for sensitivity analysis
>>> sensitivity = smkr.Sensemakr(fitted_model, treatment = "directlyharmed", benchmark_covariates = "female", kd = [1, 2, 3])
>>> # Plot extreme value of the fitted model with directlyharmed as treatment and "female" as benchmark_covariates.
>>> smkr.ovb_extreme_plot(model=fitted_model,treatment='directlyharmed',benchmark_covariates='female')
>>> # Plot extreme value of the fitted model with directlyharmed as treatment and "female" as benchmark_covariates kd=[1,2].
>>> smkr.ovb_extreme_plot(model=fitted_model,treatment='directlyharmed',benchmark_covariates='female',kd=[1,2])
>>> # Plot extreme value of the fitted model with manual benchmark
>>> smkr.ovb_extreme_plot(model=fitted_model,treatment='directlyharmed',r2dz_x=0.1)
>>> # Plot extreme value of the sensemakr object
>>> smkr.ovb_extreme_plot(sense_obj=sensitivity)