causallib.estimation.doubly_robust module
Copyright 2019 IBM Corp.
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Created on Apr 25, 2018
A module implementing several doubly-robust methods. These methods combine a causal standardization model with a causal weight model, aiming for an estimator that is more robust than either model alone. How the two models are combined differs between methods and is described in each class docstring.
- class causallib.estimation.doubly_robust.AIPW(outcome_model, weight_model, outcome_covariates=None, weight_covariates=None, overlap_weighting=False)[source]
Bases:
causallib.estimation.doubly_robust.BaseDoublyRobust
Calculates a doubly-robust estimate of the treatment effect by performing potential-outcome prediction (outcome_model) and then correcting its prediction-residuals using re-weighting from a treatment model (weight_model, like IPW).
It has two flavors, which slightly change the weighting of the outcome model in the correction term. Let e(X) be the estimated propensity score and m(X, A) the outcome estimated by the outcome model; the individual estimates are then:
m(X,1) + A*(Y-m(X,1))/e(X), and
m(X,0) + (1-A)*(Y-m(X,0))/(1-e(X))
These add the IP-weighted residuals of the observed outcomes to the model predictions, as described in Kang and Schafer (2007), section 3.1, and Robins, Rotnitzky, and Zhao (1994).
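The two standard expressions can be evaluated by hand on a toy sample. The following is a minimal sketch in plain Python, not causallib's implementation: the function name aipw_terms and the toy numbers are hypothetical, and the propensities and predictions are assumed to come from already-fitted models.

```python
from statistics import mean


def aipw_terms(y, a, e, m1, m0):
    """Return per-sample doubly-robust terms for a binary treatment.

    y: observed outcomes, a: treatment indicators (0/1),
    e: estimated propensity e(X), m1/m0: predictions m(X,1) and m(X,0).
    """
    # m(X,1) + A*(Y - m(X,1)) / e(X)
    dr1 = [m1_i + a_i * (y_i - m1_i) / e_i
           for y_i, a_i, e_i, m1_i in zip(y, a, e, m1)]
    # m(X,0) + (1-A)*(Y - m(X,0)) / (1 - e(X))
    dr0 = [m0_i + (1 - a_i) * (y_i - m0_i) / (1 - e_i)
           for y_i, a_i, e_i, m0_i in zip(y, a, e, m0)]
    return dr1, dr0


# Hypothetical toy data (not taken from causallib).
y = [3.0, 1.0, 4.0, 2.0]
a = [1, 0, 1, 0]
e = [0.5, 0.25, 0.8, 0.5]
m1 = [2.0, 2.0, 3.0, 3.0]
m0 = [1.0, 1.0, 2.0, 1.0]

dr1, dr0 = aipw_terms(y, a, e, m1, m0)
# Averaging the per-sample terms gives the population outcome per treatment.
population_outcomes = {1: mean(dr1), 0: mean(dr0)}
effect = population_outcomes[1] - population_outcomes[0]
print(population_outcomes, effect)
```

Only samples that actually received a treatment contribute a residual correction to that treatment's term; for the others the term is the plain model prediction.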
The additional flavor, enabled with overlap_weighting=True, is from Glynn and Quinn (2010). It weights the outcome model by the propensity of the other class, so extreme examples (with poor covariate overlap) contribute less to the correction (i.e. the estimate relies less on predicted values that might extrapolate too much). The Overlap Weights model uses a similar notion (hence the argument name):
A * [Y - (1-e(X))m(X,1)]/e(X) + (1-A) * m(X,1), and
(1-A) * [Y - e(X)m(X,0)]/(1-e(X)) + A * m(X,0)
- Parameters
outcome_model (IndividualOutcomeEstimator) – A causal model that estimates outcomes at the individual level (e.g. Standardization).
weight_model (WeightEstimator | PropensityEstimator) – A causal model for weighting individuals (e.g. IPW). If overlap_weighting=True, it must be a PropensityEstimator model.
outcome_covariates (array) – Covariates to use for outcome model. If None - all covariates passed will be used. Either list of column names or boolean mask.
weight_covariates (array) – Covariates to use for weight model. If None - all covariates passed will be used. Either list of column names or boolean mask.
overlap_weighting (bool) – Whether to tweak the outcome-model correction term to rely less on data points with poor covariate overlap (extreme propensity). If True, requires weight_model to be an instance of PropensityEstimator.
References
Kang and Schafer, 2007, (https://dx.doi.org/10.1214/07-STS227)
Kang and Schafer attribute the original method to Cassel, Särndal and Wretman.
Glynn and Quinn, 2010, https://doi.org/10.1093/pan/mpp036
Robins, Rotnitzky, and Zhao, 1994, https://doi.org/10.1080/01621459.1994.10476818
- estimate_effect(outcome1, outcome2, agg='population', effect_types='diff')[source]
Estimates an effect given two potential outcomes.
- Parameters
outcome1 (pd.Series) – A potential outcome.
outcome2 (pd.Series) – A potential outcome.
agg (str) – Either “population” or “individual” - whether to calculate individual effect or population effect.
effect_types (list[str] | str) – Any iterable of strings from the set of EffectEstimator.CALCULATE_EFFECT keys
- Returns
- A Series if population effect (input is scalar), whose index holds the effect types
and whose values are the corresponding computed effects. A DataFrame if individual effect (input is a vector), where columns are effect types and rows are the effects for each individual. In both cases the value type is the same as the type of outcome1 and outcome2.
- Return type
pd.Series | pd.DataFrame
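To make the population and individual cases concrete, here is a rough plain-Python sketch. It is not causallib's code: the EFFECTS mapping stands in for EffectEstimator.CALCULATE_EFFECT, and the keys "ratio" and "or" and their formulas are assumptions (only "diff" appears in the signature above). Plain dicts and lists stand in for the pandas Series and DataFrame return types.

```python
# Assumed effect definitions; the authoritative keys are those of
# EffectEstimator.CALCULATE_EFFECT in the library itself.
EFFECTS = {
    "diff": lambda y1, y2: y1 - y2,
    "ratio": lambda y1, y2: y1 / y2,
    "or": lambda y1, y2: (y1 / (1 - y1)) / (y2 / (1 - y2)),  # odds ratio
}


def estimate_effect_sketch(outcome1, outcome2, effect_types="diff"):
    """Population case: scalar outcomes in, one value per effect type out."""
    if isinstance(effect_types, str):
        effect_types = [effect_types]
    return {name: EFFECTS[name](outcome1, outcome2) for name in effect_types}


def individual_effects_sketch(outcomes1, outcomes2, effect_types="diff"):
    """Individual case: one row (dict) of effects per individual."""
    return [estimate_effect_sketch(o1, o2, effect_types)
            for o1, o2 in zip(outcomes1, outcomes2)]


population = estimate_effect_sketch(0.6, 0.4, ["diff", "ratio", "or"])
individual = individual_effects_sketch([0.6, 0.9], [0.4, 0.5], "diff")
print(population)
print(individual)
```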
- estimate_individual_outcome(X, a, treatment_values=None, predict_proba=None)[source]
Estimates individual outcome under different treatment values (interventions).
Notes
This method utilizes only the standardization model behind the doubly-robust model. Namely, this is an uncorrected outcome (one that does not incorporate the weighted observed outcome). To get a true doubly-robust estimate, use estimate_population_outcome rather than an individual outcome.
- Parameters
X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).
a (pd.Series) – Treatment assignment of size (num_subjects,).
treatment_values (Any) – Desired treatment value/s to use when estimating the counterfactual outcome. If not supplied, calculates for all available treatment values.
predict_proba (bool | None) – In case the outcome task is classification and in case learner supports the operation, if True - prediction will utilize learner’s predict_proba or decision_function which returns a continuous matrix of size (n_samples, n_classes). If False - predict will be used and return value will be based on a vector of class classifications.
- Returns
- A DataFrame whose columns are treatment values and whose rows are individuals: each column is a vector of
size (num_samples,) that contains the estimated outcome for each individual under the treatment value in the corresponding key.
- Return type
pd.DataFrame
- estimate_population_outcome(X, a, y=None, treatment_values=None, predict_proba=None, agg_func='mean')[source]
Doubly-robust averaging, combining the individual counterfactual predictions from the standardization model and the weighted observed outcomes to estimate population outcome for each treatment subgroup.
- Parameters
X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).
a (pd.Series) – Treatment assignment of size (num_subjects,).
y (pd.Series) – Observed outcome of size (num_subjects,).
treatment_values (Any) – Desired treatment value/s to stratify upon before aggregating individual into population outcome. If not supplied, calculates for all available treatment values.
predict_proba (bool | None) – To be used when invoking estimate_individual_outcome. In case the outcome task is classification and in case the learner supports the operation, if True - prediction will utilize the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False - predict will be used and the return value will be based on a vector of class classifications.
agg_func (str) – Type of aggregation function (e.g. “mean” or “median”).
- Returns
- A Series whose index holds the treatment values and whose values are numbers: the aggregated outcome for
the stratum of people whose assigned treatment is the key.
- Return type
pd.Series
- fit(X, a, y, refit_weight_model=True, **kwargs)[source]
Trains a causal model from observed data.
- Parameters
X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).
a (pd.Series) – Treatment assignment of size (num_subjects,).
y (pd.Series) – Observed outcome of size (num_subjects,).
sample_weight – To be passed to the underlying scikit-learn’s fit method.
- Returns
A causal weight model with an inner learner fitted.
- Return type
- class causallib.estimation.doubly_robust.BaseDoublyRobust(outcome_model, weight_model, outcome_covariates=None, weight_covariates=None)[source]
Bases:
causallib.estimation.base_estimator.IndividualOutcomeEstimator
Abstract class defining the interface and general initialization of specific doubly-robust methods.
- Parameters
outcome_model (IndividualOutcomeEstimator) – A causal model that estimates outcomes at the individual level (e.g. Standardization).
weight_model (WeightEstimator) – A causal model for weighting individuals (e.g. IPW).
outcome_covariates (array) – Covariates to use for outcome model. If None - all covariates passed will be used. Either list of column names or boolean mask.
weight_covariates (array) – Covariates to use for weight model. If None - all covariates passed will be used. Either list of column names or boolean mask.
- abstract fit(X, a, y, refit_weight_model=True, **kwargs)[source]
Trains a causal model from observed data.
- Parameters
X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).
a (pd.Series) – Treatment assignment of size (num_subjects,).
y (pd.Series) – Observed outcome of size (num_subjects,).
sample_weight – To be passed to the underlying scikit-learn’s fit method.
- Returns
A causal weight model with an inner learner fitted.
- Return type
- class causallib.estimation.doubly_robust.PropensityFeatureStandardization(outcome_model, weight_model, outcome_covariates=None, weight_covariates=None, feature_type='weight_vector')[source]
Bases:
causallib.estimation.doubly_robust.BaseDoublyRobust
A doubly-robust estimator of the effect of treatment. This model adds the weighting (inverse probability weighting) as an additional feature to the outcome model.
References
Bang and Robins, https://doi.org/10.1111/j.1541-0420.2005.00377.x
Kang and Schafer, section 3.3, https://dx.doi.org/10.1214/07-STS227
- Parameters
outcome_model (IndividualOutcomeEstimator) – A causal model that estimates outcomes at the individual level.
weight_model (WeightEstimator | PropensityEstimator) – A causal model for weighting individuals (e.g. IPW).
outcome_covariates (array) – Covariates to use for outcome model. If None - all covariates passed will be used. Either list of column names or boolean mask.
weight_covariates (array) – Covariates to use for weight model. If None - all covariates passed will be used. Either list of column names or boolean mask.
feature_type (str) – The type of covariate to add. One of the following options:
- “weight_vector”: uses a weight vector. Only defined for binary treatment. For example, if weight_model is IPW then: 1/Pr[A=a_i|X] for each sample i. As described in Bang and Robins (2005).
- “signed_weight_vector”: as “weight_vector”, but negates the weights of the control group. For example, if weight_model is IPW then: 1/Pr[A=1|X] for treated and -1/Pr[A=0|X] for controls. As described in the correction for Bang and Robins (2008).
- “weight_matrix”: uses the entire weight matrix. For example, if weight_model is IPW then: 1/Pr[A_i=a|X_i=x], for all treatment values a and for every sample i.
- “masked_weighted_matrix”: uses the entire weight matrix, but masks it with a dummy-encoding of the treatment assignment. For example, if weight_model is IPW then: 1/Pr[A=a_i|X=x_i] and 0 for all other a≠a_i columns. As described in Bang and Robins (2005).
- “propensity_vector”: uses the probabilities for being in the treatment group: Pr[A=1|X]. Better defined for binary treatment. Equivalent to Scharfstein, Rotnitzky, and Robins (1999), who use its inverse.
- “logit_propensity_vector”: uses the logit transformation of the propensity to treat, Pr[A=1|X]. As described in Kang and Schafer (2007).
- “propensity_matrix”: uses the probabilities for all treatment options, Pr[A_i=a|X_i=x] for all treatment values a and samples i.
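The feature types listed above can be illustrated for a binary treatment with a few lines of plain Python. This is only a sketch under stated assumptions, not the library's code: the helper propensity_features is hypothetical, the weights are assumed to be plain inverse-probability weights, and matrix columns are assumed to be ordered (A=0, A=1). The resulting vector or matrix would then be appended to the covariates seen by the outcome model.

```python
import math


def propensity_features(a, p_treated):
    """Build candidate extra features from propensities Pr[A=1|X].

    a: treatment indicators (0/1); p_treated: Pr[A=1|X] per sample.
    """
    # Probability of the treatment each sample actually received.
    p_received = [p if a_i == 1 else 1 - p for a_i, p in zip(a, p_treated)]
    weight_vector = [1 / p for p in p_received]
    # Same weights, negated for the control group.
    signed = [w if a_i == 1 else -w for a_i, w in zip(a, weight_vector)]
    # One column per treatment value, ordered (A=0, A=1).
    weight_matrix = [(1 / (1 - p), 1 / p) for p in p_treated]
    # Keep only the column of the received treatment, zero elsewhere.
    masked = [(w0 if a_i == 0 else 0.0, w1 if a_i == 1 else 0.0)
              for a_i, (w0, w1) in zip(a, weight_matrix)]
    return {
        "weight_vector": weight_vector,
        "signed_weight_vector": signed,
        "weight_matrix": weight_matrix,
        "masked_weighted_matrix": masked,
        "propensity_vector": list(p_treated),
        "logit_propensity_vector": [math.log(p / (1 - p)) for p in p_treated],
        "propensity_matrix": [(1 - p, p) for p in p_treated],
    }


features = propensity_features(a=[1, 0, 1], p_treated=[0.5, 0.2, 0.8])
for name, values in features.items():
    print(name, values)
```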
- estimate_individual_outcome(X, a, treatment_values=None, predict_proba=None)[source]
Estimates individual outcome under different treatment values (interventions)
- Parameters
X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).
a (pd.Series) – Treatment assignment of size (num_subjects,).
treatment_values (Any) – Desired treatment value/s to use when estimating the counterfactual outcome. If not supplied, calculates for all available treatment values.
predict_proba (bool | None) – In case the outcome task is classification and in case learner supports the operation, if True - prediction will utilize learner’s predict_proba or decision_function which returns a continuous matrix of size (n_samples, n_classes). If False - predict will be used and return value will be based on a vector of class classifications. If None - parameter is ignored and behaviour is as specified when initializing the IndividualOutcomeEstimator.
- Returns
- A DataFrame whose columns are treatment values and whose rows are individuals: each column is a vector of
size (num_samples,) that contains the estimated outcome for each individual under the treatment value in the corresponding key.
- Return type
pd.DataFrame
- fit(X, a, y, refit_weight_model=True, **kwargs)[source]
Trains a causal model from observed data.
- Parameters
X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).
a (pd.Series) – Treatment assignment of size (num_subjects,).
y (pd.Series) – Observed outcome of size (num_subjects,).
sample_weight – To be passed to the underlying scikit-learn’s fit method.
- Returns
A causal weight model with an inner learner fitted.
- Return type
- class causallib.estimation.doubly_robust.WeightedStandardization(outcome_model, weight_model, outcome_covariates=None, weight_covariates=None)[source]
Bases:
causallib.estimation.doubly_robust.BaseDoublyRobust
This model uses the weights from the weight-model (e.g. inverse probability weighting) as individual weights for fitting the outcome model.
References
Kang and Schafer, section 3.2, https://dx.doi.org/10.1214/07-STS227
- Parameters
outcome_model (IndividualOutcomeEstimator) – A causal model that estimates outcomes at the individual level (e.g. Standardization).
weight_model (WeightEstimator) – A causal model for weighting individuals (e.g. IPW).
outcome_covariates (array) – Covariates to use for outcome model. If None - all covariates passed will be used. Either list of column names or boolean mask.
weight_covariates (array) – Covariates to use for weight model. If None - all covariates passed will be used. Either list of column names or boolean mask.
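The idea of fitting the outcome model with per-individual weights can be sketched with a tiny weighted least-squares fit in plain Python. This is an illustration only, not how causallib implements the class: the helpers weighted_line_fit and weighted_standardization are hypothetical, the outcome model is assumed to be one straight line per treatment arm over a single covariate, and the weights w are assumed to come from an already-fitted weight model.

```python
def weighted_line_fit(x, y, w):
    """Weighted least squares for y ~ b0 + b1 * x; returns (b0, b1)."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


def weighted_standardization(x, a, y, w):
    """Fit one weighted line per treatment arm, then predict for everyone."""
    means = {}
    for arm in (0, 1):
        idx = [i for i, ai in enumerate(a) if ai == arm]
        b0, b1 = weighted_line_fit([x[i] for i in idx],
                                   [y[i] for i in idx],
                                   [w[i] for i in idx])
        # Counterfactual prediction for every individual under this arm.
        preds = [b0 + b1 * xi for xi in x]
        means[arm] = sum(preds) / len(preds)
    return means


# Hypothetical toy data: treated follow y = 2 + x, controls follow y = x.
x = [0.0, 1.0, 2.0, 3.0, 1.0, 2.0]
a = [1, 1, 1, 0, 0, 0]
y = [2.0, 3.0, 4.0, 3.0, 1.0, 2.0]
w = [2.0, 1.5, 4.0, 4.0, 1.5, 2.0]  # e.g. inverse-probability weights

means = weighted_standardization(x, a, y, w)
print(means, means[1] - means[0])
```

With a scikit-learn learner, the same weights would typically be passed through the sample_weight argument of fit.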
- estimate_individual_outcome(X, a, treatment_values=None, predict_proba=None)[source]
Estimates individual outcome under different treatment values (interventions)
- Parameters
X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).
a (pd.Series) – Treatment assignment of size (num_subjects,).
treatment_values (Any) – Desired treatment value/s to use when estimating the counterfactual outcome. If not supplied, calculates for all available treatment values.
predict_proba (bool | None) – In case the outcome task is classification and in case learner supports the operation, if True - prediction will utilize learner’s predict_proba or decision_function which returns a continuous matrix of size (n_samples, n_classes). If False - predict will be used and return value will be based on a vector of class classifications. If None - parameter is ignored and behaviour is as specified when initializing the IndividualOutcomeEstimator.
- Returns
- A DataFrame whose columns are treatment values and whose rows are individuals: each column is a vector of
size (num_samples,) that contains the estimated outcome for each individual under the treatment value in the corresponding key.
- Return type
pd.DataFrame
- fit(X, a, y, refit_weight_model=True, **kwargs)[source]
Trains a causal model from observed data.
- Parameters
X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).
a (pd.Series) – Treatment assignment of size (num_subjects,).
y (pd.Series) – Observed outcome of size (num_subjects,).
sample_weight – To be passed to the underlying scikit-learn’s fit method.
- Returns
A causal weight model with an inner learner fitted.
- Return type