causallib.estimation.doubly_robust module

(C) Copyright 2019 IBM Corp.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Created on Apr 25, 2018

A module implementing several doubly-robust methods. These methods combine a causal standardization model with a causal weight model, aiming for a better model than either one alone. The exact way they are combined differs between models and is described in each class docstring.

class causallib.estimation.doubly_robust.AIPW(outcome_model, weight_model, outcome_covariates=None, weight_covariates=None, overlap_weighting=False)[source]

Bases: causallib.estimation.doubly_robust.BaseDoublyRobust

Calculates a doubly-robust estimate of the treatment effect by performing potential-outcome prediction (outcome_model) and then correcting its prediction-residuals using re-weighting from a treatment model (weight_model, like IPW).

It has two flavors, which slightly change the weighting of the outcome model in the correction term. Let e(X) be the estimated propensity score and m(X, A) be the outcome estimated by the outcome model. The individual estimates are then:

m(X,1) + A*(Y-m(X,1))/e(X), and
m(X,0) + (1-A)*(Y-m(X,0))/(1-e(X))

These essentially add IP-weighted residuals, computed on the observed outcomes, to the predictions, as described in Kang and Schafer (2007), section 3.1, and Robins, Rotnitzky, and Zhao (1994).
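As a minimal sketch of the two expressions above, assuming a binary treatment and that the predictions and propensities are already available as NumPy arrays. The function name aipw_individual_estimates and the toy numbers are illustrative only and are not part of causallib:

```python
import numpy as np

def aipw_individual_estimates(y, a, e, m1, m0):
    """Evaluate the two standard AIPW expressions element-wise.

    y: observed outcomes, a: binary treatment (0/1),
    e: estimated propensity Pr[A=1|X],
    m1, m0: outcome-model predictions m(X,1) and m(X,0).
    """
    y, a, e, m1, m0 = (np.asarray(v, dtype=float) for v in (y, a, e, m1, m0))
    y1 = m1 + a * (y - m1) / e              # m(X,1) + A*(Y-m(X,1))/e(X)
    y0 = m0 + (1 - a) * (y - m0) / (1 - e)  # m(X,0) + (1-A)*(Y-m(X,0))/(1-e(X))
    return y1, y0

# Toy data (assumed values): one treated and one control individual.
y = np.array([3.0, 1.0])
a = np.array([1, 0])
e = np.array([0.5, 0.25])
m1 = np.array([2.0, 2.0])
m0 = np.array([1.0, 0.5])

y1, y0 = aipw_individual_estimates(y, a, e, m1, m0)
```

Only the treated individual's residual corrects the m(X,1) prediction, and only the control individual's residual corrects m(X,0); for everyone else the estimate falls back to the plain prediction.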

The additional flavor, used when overlap_weighting=True, is from Glynn and Quinn (2010). It adds weighting by the propensity of the other class to the outcome model, so extreme examples (with poor covariate overlap) contribute less to the correction (i.e. it relies less on their predicted values, which might extrapolate too much). A similar notion is used in the Overlap Weights model (hence the argument name):

A * [Y - (1-e(X))m(X,1)]/e(X) + (1-A) * m(X,1), and
(1-A) * [Y - e(X)m(X,0)]/(1-e(X)) + A * m(X,0)
Parameters
  • outcome_model (IndividualOutcomeEstimator) – A causal model that estimates outcomes at the individual level (e.g. Standardization).

  • weight_model (WeightEstimator | PropensityEstimator) – A causal model for weighting individuals (e.g. IPW). If overlap_weighting=True then must be a PropensityEstimator model.

  • outcome_covariates (array) – Covariates to use for outcome model. If None - all covariates passed will be used. Either list of column names or boolean mask.

  • weight_covariates (array) – Covariates to use for weight model. If None - all covariates passed will be used. Either list of column names or boolean mask.

  • overlap_weighting (bool) – Whether to tweak the outcome-model correction term to rely less on data points with poor covariate overlap (extreme propensity). If True, requires weight_model to be an instance of PropensityEstimator.

estimate_effect(outcome1, outcome2, agg='population', effect_types='diff')[source]

Estimates an effect given two potential outcomes.

Parameters
  • outcome1 (pd.Series) – A potential outcome.

  • outcome2 (pd.Series) – A potential outcome.

  • agg (str) – Either “population” or “individual” - whether to calculate individual effect or population effect.

  • effect_types (list[str] | str) – Any iterable of strings from the set of EffectEstimator.CALCULATE_EFFECT keys

Returns

A Series if a population effect is computed (inputs are scalars), whose index holds the effect types and whose values are the corresponding computed effects. A DataFrame if an individual effect is computed (inputs are vectors), where columns are effect types and rows are the effects for each individual. In both cases the value type is the same as the type of outcome1 and outcome2.

Return type

pd.Series | pd.DataFrame
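The scalar-versus-vector behaviour described above can be sketched roughly as follows. This is an assumption-laden illustration, not the library's implementation: the helper sketch_estimate_effect and the EFFECTS mapping are hypothetical stand-ins, only the "diff" key appears in the documented signature, and "ratio" is assumed here for the sake of the example:

```python
import pandas as pd

# Hypothetical stand-in for EffectEstimator.CALCULATE_EFFECT.
# "diff" is the documented default; "ratio" is an assumption.
EFFECTS = {
    "diff": lambda y1, y2: y1 - y2,
    "ratio": lambda y1, y2: y1 / y2,
}

def sketch_estimate_effect(outcome1, outcome2, effect_types="diff"):
    if isinstance(effect_types, str):
        effect_types = [effect_types]
    results = {name: EFFECTS[name](outcome1, outcome2) for name in effect_types}
    if isinstance(outcome1, pd.Series):
        # Vector input: one row per individual, one column per effect type.
        return pd.DataFrame(results)
    # Scalar input: one entry per effect type.
    return pd.Series(results)

# Population effect from two scalar potential outcomes.
population = sketch_estimate_effect(3.0, 1.5, ["diff", "ratio"])

# Individual effects from two vectors of potential outcomes.
individual = sketch_estimate_effect(
    pd.Series([3.0, 2.0]), pd.Series([1.0, 2.0]), "diff"
)
```

Here population is a Series indexed by effect type, while individual is a DataFrame with one "diff" column and one row per individual.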

estimate_individual_outcome(X, a, treatment_values=None, predict_proba=None)[source]

Estimates individual outcome under different treatment values (interventions).

Notes

This method utilizes only the standardization model behind the doubly-robust model. Namely, this is an uncorrected outcome (one that does not incorporate the weighted observed outcome). To get a true doubly-robust estimation, use estimate_population_outcome rather than an individual outcome.

Parameters
  • X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pd.Series) – Treatment assignment of size (num_subjects,).

  • treatment_values (Any) – Desired treatment value/s to use when estimating the counterfactual outcome. If not supplied, calculates for all available treatment values.

  • predict_proba (bool | None) – Applies when the outcome task is classification and the learner supports the operation. If True, prediction will utilize the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict will be used and the return value will be based on a vector of class classifications.

Returns

A DataFrame whose columns are treatment values and whose rows are individuals: each column is a vector of size (num_samples,) that contains the estimated outcome for each individual under the treatment value in the corresponding key.

Return type

pd.DataFrame
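To illustrate the shape of this return value, here is a small sketch of uncorrected, standardization-style prediction. It assumes a plain linear outcome model fitted by least squares; the helper sketch_individual_outcomes and the toy data are hypothetical and do not reproduce the library's code:

```python
import numpy as np
import pandas as pd

def sketch_individual_outcomes(X, a, y, treatment_values=None):
    """Fit y ~ [1, X, a] by least squares, then predict under each treatment value."""
    design = np.column_stack([np.ones(len(X)), X.to_numpy(), a.to_numpy()])
    coef, *_ = np.linalg.lstsq(design, y.to_numpy(), rcond=None)
    if treatment_values is None:
        treatment_values = sorted(a.unique())
    outcomes = {}
    for value in treatment_values:
        counterfactual = design.copy()
        counterfactual[:, -1] = value  # intervene: assign this treatment to everyone
        outcomes[value] = counterfactual @ coef
    # Columns are treatment values, rows are individuals.
    return pd.DataFrame(outcomes, index=X.index)

# Toy data (assumed): the outcome is exactly x + 2*a.
X = pd.DataFrame({"x": [0.0, 1.0, 2.0, 3.0]})
a = pd.Series([0, 1, 0, 1])
y = pd.Series([0.0, 3.0, 2.0, 5.0])

outcomes = sketch_individual_outcomes(X, a, y)
```

Each column of outcomes holds the predicted outcome for all individuals had they received that treatment value, with no weighting-based correction applied.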

estimate_population_outcome(X, a, y=None, treatment_values=None, predict_proba=None, agg_func='mean')[source]

Doubly-robust averaging, combining the individual counterfactual predictions from the standardization model and the weighted observed outcomes to estimate population outcome for each treatment subgroup.

Parameters
  • X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pd.Series) – Treatment assignment of size (num_subjects,).

  • y (pd.Series) – Observed outcome of size (num_subjects,).

  • treatment_values (Any) – Desired treatment value/s to stratify upon before aggregating individuals into a population outcome. If not supplied, calculates for all available treatment values.

  • predict_proba (bool | None) – To be used when invoking estimate_individual_outcome. Applies when the outcome task is classification and the learner supports the operation. If True, prediction will utilize the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict will be used and the return value will be based on a vector of class classifications.

  • agg_func (str) – Type of aggregation function (e.g. “mean” or “median”).

Returns

A Series whose index holds the treatment values and whose values are numbers: the aggregated outcome for the strata of people whose assigned treatment is the key.

Return type

pd.Series
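The following simulation is a rough sketch of why this averaging is called doubly robust. It assumes a binary treatment, the standard AIPW correction, the true propensities, and a deliberately wrong outcome model that predicts 0 for everyone; every name in it is illustrative and it makes no claim about the library's internals:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
e = 1.0 / (1.0 + np.exp(-x))          # true propensity Pr[A=1|X]
a = rng.binomial(1, e)
y = 2.0 * a + x + rng.normal(size=n)  # true E[Y^1] = 2, E[Y^0] = 0

# Deliberately misspecified outcome model: predicts 0 for everyone.
m1 = np.zeros(n)
m0 = np.zeros(n)

# Corrected individual estimates (standard AIPW form), then mean aggregation.
y1 = m1 + a * (y - m1) / e
y0 = m0 + (1 - a) * (y - m0) / (1 - e)
population_outcome = {1: y1.mean(), 0: y0.mean()}

# Averaging the uncorrected predictions simply reproduces the wrong model.
uncorrected = {1: m1.mean(), 0: m0.mean()}
```

With the propensities correct, the weighted residuals compensate for the wrong outcome model, so population_outcome lands close to the true values of 2 and 0 while uncorrected stays at 0 for both treatment values.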

fit(X, a, y, refit_weight_model=True, **kwargs)[source]

Trains a causal model from observed data.

Parameters
  • X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pd.Series) – Treatment assignment of size (num_subjects,).

  • y (pd.Series) – Observed outcome of size (num_subjects,).

  • sample_weight – To be passed to the underlying scikit-learn fit method.

Returns

A causal model with an inner learner fitted.

Return type

IndividualOutcomeEstimator

class causallib.estimation.doubly_robust.BaseDoublyRobust(outcome_model, weight_model, outcome_covariates=None, weight_covariates=None)[source]

Bases: causallib.estimation.base_estimator.IndividualOutcomeEstimator

Abstract class defining the interface and general initialization of specific doubly-robust methods.

Parameters
  • outcome_model (IndividualOutcomeEstimator) – A causal model that estimates outcomes at the individual level (e.g. Standardization).

  • weight_model (WeightEstimator) – A causal model for weighting individuals (e.g. IPW).

  • outcome_covariates (array) – Covariates to use for outcome model. If None - all covariates passed will be used. Either list of column names or boolean mask.

  • weight_covariates (array) – Covariates to use for weight model. If None - all covariates passed will be used. Either list of column names or boolean mask.

abstract fit(X, a, y, refit_weight_model=True, **kwargs)[source]

Trains a causal model from observed data.

Parameters
  • X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pd.Series) – Treatment assignment of size (num_subjects,).

  • y (pd.Series) – Observed outcome of size (num_subjects,).

  • sample_weight – To be passed to the underlying scikit-learn fit method.

Returns

A causal model with an inner learner fitted.

Return type

IndividualOutcomeEstimator

class causallib.estimation.doubly_robust.PropensityFeatureStandardization(outcome_model, weight_model, outcome_covariates=None, weight_covariates=None, feature_type='weight_vector')[source]

Bases: causallib.estimation.doubly_robust.BaseDoublyRobust

A doubly-robust estimator of the effect of treatment. This model adds the weighting (inverse probability weighting) as an additional feature to the outcome model.

Parameters
  • outcome_model (IndividualOutcomeEstimator) – A causal model that estimates outcomes at the individual level.

  • weight_model (WeightEstimator | PropensityEstimator) – A causal model for weighting individuals (e.g. IPW).

  • outcome_covariates (array) – Covariates to use for outcome model. If None - all covariates passed will be used. Either list of column names or boolean mask.

  • weight_covariates (array) – Covariates to use for weight model. If None - all covariates passed will be used. Either list of column names or boolean mask.

  • feature_type (str) – The type of covariate to add. One of the following options:

    • “weight_vector”: uses a weight vector. Only defined for binary treatment. For example, if weight_model is IPW then: 1/Pr[A=a_i|X] for each sample i. As described in Bang and Robins (2005).

    • “signed_weight_vector”: as “weight_vector”, but negates the weights of the control group. For example, if weight_model is IPW then: 1/Pr[A|X] for treated and -1/Pr[A|X] for controls. As described in the correction for Bang and Robins (2008).

    • “weight_matrix”: uses the entire weight matrix. For example, if weight_model is IPW then: 1/Pr[A_i=a|X_i=x], for all treatment values a and for every sample i.

    • “masked_weighted_matrix”: uses the entire weight matrix, but masks it with a dummy-encoding of the treatment assignment. For example, if weight_model is IPW then: 1/Pr[A=a_i|X=x_i] and 0 for all other a≠a_i columns. As described in Bang and Robins (2005).

    • “propensity_vector”: uses the probabilities for being in the treatment group: Pr[A=1|X]. Better defined for binary treatment. Equivalent to Scharfstein, Rotnitzky, and Robins (1999), who use its inverse.

    • “logit_propensity_vector”: uses the logit transformation of the propensity to treat, Pr[A=1|X]. As described in Kang and Schafer (2007).

    • “propensity_matrix”: uses the probabilities for all treatment options, Pr[A_i=a|X_i=x] for all treatment values a and samples i.
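The options listed above can be sketched for a binary treatment as follows. The sketch assumes the propensities Pr[A=1|X] are already estimated and that the weights are plain inverse probabilities; the helper sketch_propensity_features is hypothetical and only mirrors the option names from the list:

```python
import numpy as np

def sketch_propensity_features(a, propensity):
    """a: binary treatment (0/1); propensity: Pr[A=1|X] per sample."""
    a = np.asarray(a)
    p1 = np.asarray(propensity, dtype=float)
    propensity_matrix = np.column_stack([1 - p1, p1])  # columns: a=0, a=1
    weight_matrix = 1.0 / propensity_matrix             # 1/Pr[A=a|X] for every a
    rows = np.arange(len(a))
    weight_vector = weight_matrix[rows, a]              # 1/Pr[A=a_i|X]
    mask = np.eye(2)[a]                                 # dummy-encoding of a
    return {
        "weight_vector": weight_vector,
        # Negate the weights of the control group.
        "signed_weight_vector": np.where(a == 1, weight_vector, -weight_vector),
        "weight_matrix": weight_matrix,
        # Keep only the weight of the treatment actually received.
        "masked_weighted_matrix": weight_matrix * mask,
        "propensity_vector": p1,
        "logit_propensity_vector": np.log(p1 / (1 - p1)),
        "propensity_matrix": propensity_matrix,
    }

# Toy input (assumed): one treated sample and one control sample.
features = sketch_propensity_features(a=[1, 0], propensity=[0.8, 0.5])
```

Whichever option is chosen, the resulting vector or matrix would then be appended as extra column(s) to the covariates given to the outcome model.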

estimate_individual_outcome(X, a, treatment_values=None, predict_proba=None)[source]

Estimates individual outcome under different treatment values (interventions).

Parameters
  • X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pd.Series) – Treatment assignment of size (num_subjects,).

  • treatment_values (Any) – Desired treatment value/s to use when estimating the counterfactual outcome. If not supplied, calculates for all available treatment values.

  • predict_proba (bool | None) – Applies when the outcome task is classification and the learner supports the operation. If True, prediction will utilize the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict will be used and the return value will be based on a vector of class classifications. If None, the parameter is ignored and behaviour is as specified when initializing the IndividualOutcomeEstimator.

Returns

A DataFrame whose columns are treatment values and whose rows are individuals: each column is a vector of size (num_samples,) that contains the estimated outcome for each individual under the treatment value in the corresponding key.

Return type

pd.DataFrame

fit(X, a, y, refit_weight_model=True, **kwargs)[source]

Trains a causal model from observed data.

Parameters
  • X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pd.Series) – Treatment assignment of size (num_subjects,).

  • y (pd.Series) – Observed outcome of size (num_subjects,).

  • sample_weight – To be passed to the underlying scikit-learn fit method.

Returns

A causal model with an inner learner fitted.

Return type

IndividualOutcomeEstimator

class causallib.estimation.doubly_robust.WeightedStandardization(outcome_model, weight_model, outcome_covariates=None, weight_covariates=None)[source]

Bases: causallib.estimation.doubly_robust.BaseDoublyRobust

This model uses the weights from the weight-model (e.g. inverse probability weighting) as individual weights for fitting the outcome model.
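One way to picture "weights as individual weights for fitting the outcome model" is weighted least squares. The sketch below assumes a linear outcome model, a binary treatment, and known propensities; the helper fit_weighted_linear_outcome is hypothetical and is not how the library itself is implemented:

```python
import numpy as np

def fit_weighted_linear_outcome(X, a, y, weights):
    """Weighted least squares of y on [1, X, a]; returns the coefficients."""
    design = np.column_stack([np.ones(len(y)), X, a])
    sw = np.sqrt(weights)
    # Scaling rows by sqrt(w) turns weighted least squares into ordinary lstsq.
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
e = 1.0 / (1.0 + np.exp(-x))                 # assumed-known propensity Pr[A=1|X]
a = rng.binomial(1, e)
y = 1.0 + 0.5 * x + 2.0 * a                  # noiseless linear outcome
ipw = np.where(a == 1, 1 / e, 1 / (1 - e))   # inverse probability weights

coef = fit_weighted_linear_outcome(x, a, y, ipw)  # intercept, x, treatment
```

Because the toy outcome is exactly linear and noiseless, the weighted fit recovers the generating coefficients; with a misspecified outcome model the weights would instead shift the fit toward the reweighted population.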

Parameters
  • outcome_model (IndividualOutcomeEstimator) – A causal model that estimates outcomes at the individual level (e.g. Standardization).

  • weight_model (WeightEstimator) – A causal model for weighting individuals (e.g. IPW).

  • outcome_covariates (array) – Covariates to use for outcome model. If None - all covariates passed will be used. Either list of column names or boolean mask.

  • weight_covariates (array) – Covariates to use for weight model. If None - all covariates passed will be used. Either list of column names or boolean mask.

estimate_individual_outcome(X, a, treatment_values=None, predict_proba=None)[source]

Estimates individual outcome under different treatment values (interventions).

Parameters
  • X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pd.Series) – Treatment assignment of size (num_subjects,).

  • treatment_values (Any) – Desired treatment value/s to use when estimating the counterfactual outcome. If not supplied, calculates for all available treatment values.

  • predict_proba (bool | None) – Applies when the outcome task is classification and the learner supports the operation. If True, prediction will utilize the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict will be used and the return value will be based on a vector of class classifications. If None, the parameter is ignored and behaviour is as specified when initializing the IndividualOutcomeEstimator.

Returns

A DataFrame whose columns are treatment values and whose rows are individuals: each column is a vector of size (num_samples,) that contains the estimated outcome for each individual under the treatment value in the corresponding key.

Return type

pd.DataFrame

fit(X, a, y, refit_weight_model=True, **kwargs)[source]

Trains a causal model from observed data.

Parameters
  • X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pd.Series) – Treatment assignment of size (num_subjects,).

  • y (pd.Series) – Observed outcome of size (num_subjects,).

  • sample_weight – To be passed to the underlying scikit-learn fit method.

Returns

A causal model with an inner learner fitted.

Return type

IndividualOutcomeEstimator