causallib.estimation.xlearner module

Copyright 2019 IBM Corp.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Created on Sep 9, 2021

class causallib.estimation.xlearner.XLearner(outcome_model, effect_model=None, treatment_model=None, predict_proba=True, effect_types='diff')[source]

Bases: causallib.estimation.base_estimator.IndividualOutcomeEstimator

An X-learner model for causal inference (Künzel et al. 2018, PNAS, https://www.pnas.org/content/116/10/4156). Uses two outcome estimators. The first is used to calculate the response, while the second is used inversely to calculate the treatment effect, which is averaged according to the propensity of the treatment assignment.
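The three stages described above can be sketched directly with NumPy and scikit-learn. This is an illustrative outline of the X-learner recipe from the cited paper, not causallib's implementation: the function name `x_learner_cate`, the choice of `LinearRegression` for every stage, the `LogisticRegression` propensity model, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression


def x_learner_cate(X, a, y):
    """Per-row treatment effect estimates for a binary treatment (hypothetical helper)."""
    treated, control = a == 1, a == 0

    # Stage 1: fit one outcome (response) model per treatment arm.
    mu0 = LinearRegression().fit(X[control], y[control])
    mu1 = LinearRegression().fit(X[treated], y[treated])

    # Stage 2: impute individual effects with the *other* arm's model,
    # then fit an effect model on each arm's imputed effects.
    d1 = y[treated] - mu0.predict(X[treated])    # observed minus predicted control
    d0 = mu1.predict(X[control]) - y[control]    # predicted treated minus observed
    tau1 = LinearRegression().fit(X[treated], d1)
    tau0 = LinearRegression().fit(X[control], d0)

    # Stage 3: average the two effect models, weighted by the propensity g(x).
    g = LogisticRegression().fit(X, a).predict_proba(X)[:, 1]
    return g * tau0.predict(X) + (1 - g) * tau1.predict(X)


# Synthetic data with a constant true effect of 2.0 (assumed for illustration).
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
a = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
y = X[:, 0] + 2.0 * a + rng.normal(scale=0.1, size=n)

cate = x_learner_cate(X, a, y)
ate = float(cate.mean())
print(f"estimated average effect: {ate:.3f}")
```

Because the simulated outcome is linear in the covariates, the linear stage models should recover an average effect close to the simulated value; with real data the choice of learners matters considerably more.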

Parameters

outcome_model (IndividualOutcomeEstimator) – Initialized causallib estimator that will be used to predict the outcome of each treatment for a given case. To adhere to the X-learner algorithm, a StratifiedStandardization object should be used for both the outcome and CATE models, initialized with comparable sklearn learners. The X-learner algorithm is suitable for a binary outcome; if a non-binary outcome is used, the class will treat the last outcome versus the rest as the binary outcome.
effect_model (IndividualOutcomeEstimator | None) – Initialized causallib estimator that will be used to predict the treatment effect of each case. The treatment effect is estimated on the observed set using the outcome model; if the treatment effect is continuous, use a regression model. The default estimator is cloned from the outcome model. The cloning is done after the outcome model is fitted, which enables a warm start of the CATE model from the outcome model when outcome_model has its warm_start attribute turned on.
treatment_model – Initialized sklearn prediction model that will predict the probability of each treatment. The X-learner algorithm is suitable for binary treatment. If None, the sklearn DummyClassifier with the prior strategy will be used.
predict_proba (bool) – Applies when the outcome task is classification and the learner supports the operation. If True, prediction will use the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict will be used and the return value will be based on a vector of class classifications. X-learner effect estimation (in the case of a binary effect) requires the outcome estimator to predict probabilities of classification (predict_proba=True).
effect_types (str) – String from the set of EffectEstimator.CALCULATE_EFFECT keys.

estimate_effect(X, a, agg='population', predict_proba=None, effect_types=None)[source]

Estimates the causal effect between treatment groups.

Parameters

X (pd.DataFrame) – Covariates to predict on.
a (pd.Series) – Corresponding treatment assignment to utilize for prediction. Assumes treated group is coded as 1, and control group as 0.
agg (str) – Either “population” or “individual” - whether to calculate individual effect or population effect.
predict_proba (bool | None) – Applies when the outcome task is classification and the learner supports the operation. If True, prediction will use the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict will be used and the return value will be based on a vector of class classifications. If None, the object’s initialized predict_proba value is used.
effect_types (None) – IGNORED

Returns

the estimated causal effect

Return type

pd.Series

estimate_individual_outcome(X, a, treatment_values=None, predict_proba=None)[source]

Estimates individual outcome under different treatment values (interventions)

Parameters

X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).
a (pd.Series) – Treatment assignment of size (num_subjects,).
treatment_values (Any) – Desired treatment value/s to use when estimating the counterfactual outcome. If not supplied, calculates for all available treatment values.
predict_proba (bool | None) – In case the outcome task is classification and in case learner supports the operation, if True - prediction will utilize learner’s predict_proba or decision_function which returns a continuous matrix of size (n_samples, n_classes). If False - predict will be used and return value will be based on a vector of class classifications. If None - parameter is ignored and behaviour is as specified when initializing the IndividualOutcomeEstimator.

Returns

DataFrame whose columns are treatment values and whose rows are individuals. Each column is a vector of size (num_samples,) containing the estimated outcome for each individual under the corresponding treatment value.

Return type

pd.DataFrame

fit(X, a, y, sample_weight=None, predict_proba=None)[source]

Trains a causal model from observed data.

Parameters

X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).
a (pd.Series) – Treatment assignment of size (num_subjects,).
y (pd.Series) – Observed outcome of size (num_subjects,).
sample_weight – To be passed to the underlying outcome model’s fit method.
predict_proba (bool | None) – Applies when the outcome task is classification and the learner supports the operation. If True, prediction will use the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict will be used and the return value will be based on a vector of class classifications. If None, the object’s initialized predict_proba value is used.

Returns

A causal model with its inner models fitted.

Return type

IndividualOutcomeEstimator