sensitivity_statistics

Description

Computes the sensitivity statistics (robustness value, partial R2, and Cohen’s f2), plus helper functions.

Reference

Cinelli, C. and Hazlett, C. (2020), “Making Sense of Sensitivity: Extending Omitted Variable Bias.” Journal of the Royal Statistical Society, Series B (Statistical Methodology).

Functions

sensemakr.robustness_value(model=None, covariates=None, t_statistic=None, dof=None, q=1, alpha=1.0)

Compute the robustness value of a regression coefficient.

The robustness value describes the minimum strength of association (parameterized in terms of partial R2) that omitted variables would need to have both with the treatment and with the outcome to change the estimated coefficient by a certain amount (for instance, to bring it down to zero).

For instance, a robustness value of 1% means that an unobserved confounder that explains 1% of the residual variance of the outcome and 1% of the residual variance of the treatment is strong enough to explain away the estimated effect. A robustness value of 90%, by contrast, means that any unobserved confounder that explains less than 90% of the residual variance of both the outcome and the treatment cannot fully account for the observed effect. You may also compute the robustness value taking sampling uncertainty into account. See Cinelli and Hazlett (2020) for details.
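As a rough illustration of the point-estimate case, the sketch below computes RV_q from a t-value and degrees of freedom using the closed-form expression given in Cinelli and Hazlett (2020): RV_q = (sqrt(f_q^4 + 4 f_q^2) - f_q^2) / 2, with f_q = q |t| / sqrt(dof). The helper name rv_from_t is hypothetical, the code ignores sampling uncertainty (alpha), and it is not necessarily how sensemakr implements robustness_value.

```python
import math

def rv_from_t(t_statistic, dof, q=1.0):
    """Hypothetical helper: point-estimate robustness value RV_q (no alpha adjustment)."""
    # Partial Cohen's f of the treatment with the outcome, scaled by q.
    f_q = q * abs(t_statistic) / math.sqrt(dof)
    # Closed form from Cinelli and Hazlett (2020).
    return 0.5 * (math.sqrt(f_q**4 + 4 * f_q**2) - f_q**2)

rv_full = rv_from_t(4.18445, 783)         # bring the estimate down to zero
rv_half = rv_from_t(4.18445, 783, q=0.5)  # cut the estimate in half
print(f"RV (q = 1):   {rv_full:.4f}")
print(f"RV (q = 0.5): {rv_half:.4f}")
```

With the t-value and degrees of freedom used in the Examples, the first value is roughly 0.139: confounders would need to explain about 14% of the residual variance of both the treatment and the outcome to bring the estimate to zero.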

The function robustness_value takes either a statsmodels OLSResults object or the t-value and degrees of freedom directly.

Required parameters: either model or (t_statistic and dof).

Parameters:
  • model (statsmodels OLSResults object) – a statsmodels OLSResults object containing the restricted regression.

  • covariates (string or list of strings) – a string or list of strings with the names of the variables to use for benchmark bounding.

  • t_statistic (float) – a float with the t_statistic for the restricted model regression.

  • dof (int) – an int with the degrees of freedom of the restricted regression.

  • q (float) – a float with the proportion by which to reduce the point estimate for the robustness value RV_q; q = 1 reduces the estimate to zero (Default value = 1).

  • alpha (float) – a float with the significance level for the robustness value RV_qa to render the estimate not significant (Default value = 1.0).

Returns:

a numpy array with the robustness value

Return type:

numpy array

Examples

>>> # Load example dataset
>>> import sensemakr as smkr
>>> darfur = smkr.load_darfur()
>>> # Fit a statsmodels OLSResults object ("fitted_model")
>>> import statsmodels.formula.api as smf
>>> model = smf.ols(formula='peacefactor ~ directlyharmed + age + farmer_dar + herder_dar + pastvoted + hhsize_darfur + female + village', data=darfur)
>>> fitted_model = model.fit()
>>> # Robustness value of directly harmed q = 1 (reduce estimate to zero):
>>> smkr.robustness_value(model = fitted_model, covariates = "directlyharmed") 
>>> # Robustness value of directly harmed q = 1/2 (reduce estimate in half):
>>> smkr.robustness_value(model = fitted_model, covariates = "directlyharmed", q = 1/2) 
>>> # Robustness value of directly harmed q = 1/2, alpha = 0.05 (reduce estimate in half, with 95% confidence):
>>> smkr.robustness_value(model = fitted_model, covariates = "directlyharmed", q = 1/2, alpha = 0.05) 
>>> # You can also provide the statistics directly:
>>> smkr.robustness_value(t_statistic = 4.18445, dof = 783) 

sensemakr.sensitivity_stats(model=None, treatment=None, estimate=None, se=None, dof=None, q=1, alpha=0.05, reduce=True)

Computes the robustness_value, partial_r2 and partial_f2 of the coefficient of interest.

Required parameters: either model and treatment, or (estimate, se, and dof).

Parameters:
  • model (statsmodels OLSResults object) – a statsmodels OLSResults object containing the restricted regression.

  • treatment (string) – a string with treatment variable name.

  • estimate (float) – a float with the coefficient estimate of the restricted regression.

  • se (float) – a float with the standard error of the coefficient estimate in the restricted regression.

  • dof (int) – an int with the degrees of freedom of the restricted regression.

  • q (float) – a float with the proportion by which to reduce the point estimate for the robustness value RV_q; q = 1 reduces the estimate to zero (Default value = 1).

  • alpha (float) – a float with the significance level for the robustness value RV_qa to render the estimate not significant (Default value = 0.05).

  • reduce (boolean) – whether to reduce or increase the estimate due to confounding (Default value = True).

Returns:

a Pandas DataFrame containing the following quantities:

treatment : a string with the name of the treatment variable.

estimate : a float with the estimated effect of the treatment.

se : a float with the estimated standard error of the treatment effect.

t_statistics : a float with the t-value of the treatment.

r2yd_x : a float with the partial R2 of the treatment and the outcome, see details in partial_r2.

rv_q : a float with the robustness value of the treatment, see details in robustness_value.

rv_qa : a float with the robustness value of the treatment considering statistical significance, see details in robustness_value.

f2yd_x : a float with the partial (Cohen’s) f2 of the treatment with the outcome, see details in partial_f2.

dof : an int with the degrees of freedom of the model.

Return type:

Pandas DataFrame
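When only estimate, se, and dof are available, the point-estimate quantities listed above follow from the t-value alone. The sketch below is an illustration under that assumption, not the library's code: the helper name stats_from_estimate is hypothetical, it returns a plain dict rather than a DataFrame, and it omits rv_qa, which additionally requires a critical t-value.

```python
import math

def stats_from_estimate(estimate, se, dof, q=1.0):
    """Hypothetical helper mirroring some of the quantities of sensitivity_stats."""
    t = estimate / se                     # t-value of the treatment
    r2yd_x = t**2 / (t**2 + dof)          # partial R2 of the treatment with the outcome
    f2yd_x = t**2 / dof                   # partial (Cohen's) f2
    f_q = q * math.sqrt(f2yd_x)
    rv_q = 0.5 * (math.sqrt(f_q**4 + 4 * f_q**2) - f_q**2)  # point-estimate robustness value
    return {"estimate": estimate, "se": se, "t_statistics": t,
            "r2yd_x": r2yd_x, "rv_q": rv_q, "f2yd_x": f2yd_x, "dof": dof}

stats = stats_from_estimate(estimate=0.09731582, se=0.02325654, dof=783)
for name, value in stats.items():
    print(f"{name:12s} {value:.6g}")
```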

Examples

>>> # Load example dataset:
>>> import sensemakr as smkr
>>> darfur = smkr.load_darfur()
>>> # Fit a statsmodels OLSResults object ("fitted_model"):
>>> import statsmodels.formula.api as smf
>>> model = smf.ols(formula='peacefactor ~ directlyharmed + age + farmer_dar + herder_dar + pastvoted + hhsize_darfur + female + village', data=darfur)
>>> fitted_model = model.fit()
>>> # Sensitivity stats for directly harmed:
>>> smkr.sensitivity_stats(model = fitted_model, treatment = "directlyharmed") 
>>> # You can also pass the numeric values directly:
>>> smkr.sensitivity_stats(estimate = 0.09731582, se = 0.02325654, dof = 783) 

sensemakr.partial_r2(model=None, covariates=None, t_statistic=None, dof=None)

Compute the partial R2 for a linear regression model.

The partial R2 describes how much of the residual variance of the outcome (after partialing out the other covariates) a covariate explains.

The partial R2 can be used as an extreme-scenario sensitivity analysis to omitted variables. Considering an unobserved confounder that explains 100% of the residual variance of the outcome, the partial R2 describes how strongly associated with the treatment this unobserved confounder would need to be in order to explain away the estimated effect.

For details see Cinelli and Hazlett (2020).
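For a single covariate, the partial R2 can be recovered from its t-value and the residual degrees of freedom through the standard identity R2 = t^2 / (t^2 + dof). The sketch below assumes that identity and also checks the extreme-scenario reading described above; the helper name partial_r2_from_t is hypothetical and is not part of sensemakr.

```python
import math

def partial_r2_from_t(t_statistic, dof):
    """Hypothetical helper: partial R2 of one covariate from its t-value."""
    t2 = t_statistic**2
    return t2 / (t2 + dof)

t, dof = 4.18445, 783
r2 = partial_r2_from_t(t, dof)
print(f"partial R2: {r2:.6f}")

# Extreme scenario: a confounder explaining 100% of the residual outcome variance
# (r2_outcome = 1) and a share r2 of the residual treatment variance.
# Relative bias = sqrt(r2_outcome * r2 / (1 - r2)) / (|t| / sqrt(dof)).
r2_outcome = 1.0
relative_bias = math.sqrt(r2_outcome * r2 / (1 - r2)) / (abs(t) / math.sqrt(dof))
print(f"relative bias: {relative_bias:.6f}")
```

A relative bias of 1 means the confounder is just strong enough to account for the whole estimate, which is the sense in which the partial R2 bounds the extreme scenario.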

Required parameters: either model or (t_statistic and dof).

Parameters:
  • model (statsmodels OLSResults object) – a statsmodels OLSResults object containing the restricted regression.

  • covariates (string or list of strings) – a string or list of strings with the covariates used to compute the t_statistic and dof from the model. If not specified, defaults to all variables.

  • t_statistic (float) – a float with the t_statistic for the restricted model regression.

  • dof (int) – an int with the degrees of freedom of the restricted regression.

Returns:

a float with the computed partial R2.

Return type:

float

Examples

This function takes either a statsmodels OLSResults object or the t-value and degrees of freedom directly. For the partial R2 of groups of covariates, see group_partial_r2.

>>> # Load example dataset:
>>> import sensemakr as smkr
>>> darfur = smkr.load_darfur()
>>> # Fit a statsmodels OLSResults object ("fitted_model"):
>>> import statsmodels.formula.api as smf
>>> model = smf.ols(formula='peacefactor ~ directlyharmed + age + farmer_dar + herder_dar + pastvoted + hhsize_darfur + female + village', data=darfur)
>>> fitted_model = model.fit()
>>> # Partial R2 of directly harmed with peacefactor:
>>> float(smkr.partial_r2(model = fitted_model, covariates = "directlyharmed"))  
0.02187
>>> # Partial R2 of female with peacefactor:
>>> float(smkr.partial_r2(model = fitted_model, covariates = "female"))  
0.10903
>>> # You can also provide the statistics directly:
>>> float(smkr.partial_r2(t_statistic = 4.18445, dof = 783))  
0.021873

sensemakr.partial_f2(model=None, covariates=None, t_statistic=None, dof=None)

Compute the partial (Cohen’s) f2 for a linear regression model.

The partial (Cohen’s) f2 is a common measure of effect size (a transformation of the partial R2) that can also be used directly for sensitivity analysis using a bias factor table. For details see Cinelli and Hazlett (2020).
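The transformation mentioned above is f2 = R2 / (1 - R2), which for a single covariate reduces to t^2 / dof. The sketch below checks that both routes agree; the helper names are hypothetical and only illustrate the relationship, not the library's implementation.

```python
def partial_f2_from_t(t_statistic, dof):
    """Hypothetical helper: partial Cohen's f2 from a t-value."""
    return t_statistic**2 / dof

def partial_f2_from_r2(r2):
    """Hypothetical helper: partial Cohen's f2 from a partial R2."""
    return r2 / (1 - r2)

t, dof = 4.18445, 783
f2_direct = partial_f2_from_t(t, dof)
r2 = t**2 / (t**2 + dof)              # partial R2 of the same covariate
f2_via_r2 = partial_f2_from_r2(r2)
print(f"{f2_direct:.6f} {f2_via_r2:.6f}")
```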

This function takes either a statsmodels OLSResults object or the t-value and degrees of freedom directly.

Required parameters: either model or (t_statistic and dof).

Parameters:
  • model (statsmodels OLSResults object) – a statsmodels OLSResults object containing the restricted regression.

  • covariates (string or list of strings) – a string or list of strings with the covariates used to compute the t_statistic and dof from the model. If not specified, defaults to all variables.

  • t_statistic (float) – a float with the t_statistic for the restricted model regression.

  • dof (int) – an int with the degrees of freedom of the restricted regression.

Returns:

a float with the computed partial f2.

Return type:

float

Examples

>>> # Load example dataset:
>>> import sensemakr as smkr
>>> darfur = smkr.load_darfur()
>>> # Fit a statsmodels OLSResults object ("fitted_model"):
>>> import statsmodels.formula.api as smf
>>> model = smf.ols(formula='peacefactor ~ directlyharmed + age + farmer_dar + herder_dar + pastvoted + hhsize_darfur + female + village', data=darfur)
>>> fitted_model = model.fit()
>>> # Partial f2 of directly harmed with peacefactor:
>>> smkr.partial_f2(model = fitted_model, covariates = "directlyharmed") 
>>> # Partial f2 of female with peacefactor:
>>> smkr.partial_f2(model = fitted_model, covariates = "female") 
>>> # You can also provide the statistics directly:
>>> smkr.partial_f2(t_statistic = 4.18445, dof = 783) 
0.022362

sensemakr.group_partial_r2(model=None, covariates=None, f_statistic=None, p=None, dof=None)

Partial R2 of groups of covariates in a linear regression model.

This function computes the partial R2 of a group of covariates in a linear regression model. It is the multivariate version of the partial_r2 function; see that function for more details.
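For a group of covariates, the partial R2 can be written in terms of the joint F-statistic as R2 = F p / (F p + dof), where p is the number of parameters being tested. The sketch below assumes that identity; the helper name group_partial_r2_from_f is hypothetical, and the F-statistic and p used are made-up illustrative numbers, not results from the Darfur data.

```python
def group_partial_r2_from_f(f_statistic, p, dof):
    """Hypothetical helper: partial R2 of a group of p parameters from their joint F."""
    return f_statistic * p / (f_statistic * p + dof)

# Illustrative (made-up) joint F-statistic for a group of two parameters.
group_r2 = group_partial_r2_from_f(f_statistic=10.0, p=2, dof=783)
print(f"group partial R2: {group_r2:.6f}")

# With a single parameter, F equals t^2 and the formula reduces to t^2 / (t^2 + dof).
t, dof = 4.18445, 783
single_r2 = group_partial_r2_from_f(f_statistic=t**2, p=1, dof=dof)
print(f"single-covariate partial R2: {single_r2:.6f}")
```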

Required parameters: either model or (f_statistic, p, and dof).

Parameters:
  • model (statsmodels OLSResults object) – a statsmodels OLSResults object containing the restricted regression.

  • covariates (string or list of strings) – a string or list of strings with the covariates used to compute the f_statistic and dof from the model. If not specified, defaults to all variables.

  • f_statistic (float) – a float with the f_statistic for the restricted model regression.

  • p (int) – an int with the number of parameters in the model.

  • dof (int) – an int with the degrees of freedom of the restricted regression.

Returns:

a float with the computed group partial R2.

Return type:

float

Examples

>>> # Load example dataset:
>>> import sensemakr as smkr
>>> darfur = smkr.load_darfur()
>>> # Fit a statsmodels OLSResults object ("fitted_model"):
>>> import statsmodels.formula.api as smf
>>> model = smf.ols(formula='peacefactor ~ directlyharmed + age + farmer_dar + herder_dar + pastvoted + hhsize_darfur + female + village', data=darfur)
>>> fitted_model = model.fit()
>>> float(smkr.group_partial_r2(model = fitted_model, covariates = ["female", "pastvoted"])) 
0.11681