causallib.estimation.xlearner.XLearner#

class XLearner(outcome_model, effect_model=None, treatment_model=None, predict_proba=True, effect_types='diff')[source]#

An X-learner model for causal inference (Künzel et al. 2019, PNAS, https://www.pnas.org/content/116/10/4156). Uses two estimators. The first predicts the outcome (response) under each treatment. The second is fitted on the imputed treatment effects, and its predictions are averaged according to the propensity of the treatment assignment.
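The stages described above can be sketched with plain NumPy. This is a minimal illustration of the X-learner idea from Künzel et al., not causallib's implementation: `fit_linear` and `x_learner_cate` are hypothetical helper names, the base learner is ordinary least squares, the data are simulated, and the propensity is assumed constant (the observed treated fraction).

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns a predict function."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return lambda X_new: np.column_stack([np.ones(len(X_new)), X_new]) @ coef


def x_learner_cate(X, a, y):
    """Hypothetical helper: individual effect estimates for binary treatment a."""
    treated, control = a == 1, a == 0

    # Stage 1: one outcome model per treatment group.
    mu0 = fit_linear(X[control], y[control])
    mu1 = fit_linear(X[treated], y[treated])

    # Stage 2: impute individual effects, then fit one effect model per group.
    d1 = y[treated] - mu0(X[treated])   # observed minus predicted control outcome
    d0 = mu1(X[control]) - y[control]   # predicted treated outcome minus observed
    tau1 = fit_linear(X[treated], d1)
    tau0 = fit_linear(X[control], d0)

    # Stage 3: propensity-weighted average of the two effect models.
    # Assumption: a constant propensity equal to the treated fraction.
    g = treated.mean()
    return g * tau0(X) + (1 - g) * tau1(X)


# Simulated data with a known heterogeneous effect of 1 + 0.5 * x0.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
a = rng.integers(0, 2, size=n)
true_cate = 1.0 + 0.5 * X[:, 0]
y = X @ np.array([1.0, -1.0]) + a * true_cate + rng.normal(scale=0.1, size=n)

cate = x_learner_cate(X, a, y)   # one effect estimate per individual
ate = cate.mean()                # population-level effect
```

In this sketch the individual estimates play the role of an "individual" aggregation and their mean plays the role of a "population" aggregation.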

Parameters:
  • outcome_model (IndividualOutcomeEstimator) – Initialized causallib estimator that will be used to predict the outcome of each treatment given a case. To adhere to the XLearner algorithm, a StratifiedStandardization object should be used for both the outcome and CATE models, initialized with comparable sklearn learners. The XLearner algorithm is suited to a binary outcome; if a non-binary outcome is used, the class treats the last outcome versus the rest as the binary outcome.

  • effect_model (IndividualOutcomeEstimator | None) – Initialized causallib estimator that will be used to predict the treatment effect of each case. The treatment effect is estimated on the observed set using the outcome model. If the treatment effect is continuous, use a regression model. The default estimator is cloned from the outcome model. The cloning is done after the outcome model is fitted, which enables a warm start of the CATE model from the outcome model when outcome_model has its warm_start attribute on.

  • treatment_model – Initialized sklearn prediction model that will predict the probability of each treatment. The XLearner algorithm is suited to binary treatment. If None, the sklearn DummyClassifier with the prior strategy will be used.

  • predict_proba (bool) – Applies when the outcome task is classification and the learner supports the operation. If True, prediction will use the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict will be used and the return value will be based on a vector of class labels. XLearner effect estimation (in the case of a binary effect) requires the outcome estimator to predict class probabilities (predict_proba=True).

  • effect_types (str) – String from the set of EffectEstimator.CALCULATE_EFFECT keys.
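To see why effect estimation on a binary outcome calls for predicted probabilities rather than class labels, consider the following small sketch. The probability values and the `to_label` helper are invented for illustration and do not come from causallib; the 0.5 threshold is likewise an assumption.

```python
# Hypothetical predicted probabilities of the positive outcome class
# for three individuals, under treatment and under control.
p_treated = [0.62, 0.55, 0.91]
p_control = [0.58, 0.51, 0.40]


def to_label(p, threshold=0.5):
    """Hypothetical helper: hard class label from a probability."""
    return int(p >= threshold)


# Probability-based difference keeps small effects visible.
effect_from_proba = [p1 - p0 for p1, p0 in zip(p_treated, p_control)]

# Label-based difference collapses them: both arms round to the same class.
effect_from_labels = [to_label(p1) - to_label(p0)
                      for p1, p0 in zip(p_treated, p_control)]
```

For the first two individuals the label-based difference is zero even though the probabilities differ, which is the information a probability-based estimate retains.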


estimate_effect(X, a, agg='population', predict_proba=None, effect_types=None)[source]#

Estimates the causal effect between treatment groups.

Parameters:
  • X (pandas.DataFrame) – Covariates to predict on.

  • a (pandas.Series) – Corresponding treatment assignment to utilize for prediction. Assumes treated group is coded as 1, and control group as 0.

  • agg (str) – Either “population” or “individual” - whether to calculate individual effect or population effect.

  • predict_proba (bool | None) – Applies when the outcome task is classification and the learner supports the operation. If True, prediction will use the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict will be used and the return value will be based on a vector of class labels. If None, the object’s initialized predict_proba value is used.

  • effect_types (None) – IGNORED

Returns:

The estimated causal effect.

Return type:

pandas.Series

estimate_individual_outcome(X, a, treatment_values=None, predict_proba=None)[source]#

Estimates individual outcome under different treatment values (interventions)

Parameters:
  • X (pandas.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pandas.Series) – Treatment assignment of size (num_subjects,).

  • treatment_values (Any) – Desired treatment value(s) to use when estimating the counterfactual outcome. If not supplied, calculates for all available treatment values.

  • predict_proba (bool | None) – Applies when the outcome task is classification and the learner supports the operation. If True, prediction will use the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict will be used and the return value will be based on a vector of class labels. If None, the parameter is ignored and the behaviour is as specified when initializing the IndividualOutcomeEstimator.

Returns:

DataFrame whose columns are treatment values and whose rows are individuals: each column is a vector of size (num_samples,) that contains the estimated outcome for each individual under the treatment value in the corresponding key.

Return type:

pandas.DataFrame
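The layout of the returned frame can be illustrated with a hand-built stand-in. The values and index labels below are invented, and the frame is constructed directly with pandas rather than produced by a fitted model. Taking the column difference is only one generic way to read such a frame, not necessarily how XLearner's estimate_effect computes its result.

```python
import pandas as pd

# Hypothetical stand-in for a counterfactual-outcome frame:
# one column per treatment value (0 and 1), one row per individual.
outcomes = pd.DataFrame(
    {0: [0.20, 0.35, 0.50], 1: [0.30, 0.30, 0.80]},
    index=["subject_1", "subject_2", "subject_3"],
)

# Difference between the two counterfactual columns, per individual.
individual_effect = outcomes[1] - outcomes[0]

# Averaging over individuals gives a single population-level number.
population_effect = individual_effect.mean()
```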

fit(X, a, y, sample_weight=None, predict_proba=None)[source]#

Trains a causal model from observed data.

Parameters:
  • X (pandas.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pandas.Series) – Treatment assignment of size (num_subjects,).

  • y (pandas.Series) – Observed outcome of size (num_subjects,).

  • sample_weight – To be passed to the underlying outcome model's fit method.

  • predict_proba (bool | None) – Applies when the outcome task is classification and the learner supports the operation. If True, prediction will use the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict will be used and the return value will be based on a vector of class labels. If None, the object’s initialized predict_proba value is used.

Returns:

A causal model with its inner models fitted.

Return type:

IndividualOutcomeEstimator

set_fit_request(*, a='$UNCHANGED$', predict_proba='$UNCHANGED$', sample_weight='$UNCHANGED$')#

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • a (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for a parameter in fit.

  • predict_proba (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predict_proba parameter in fit.

  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self – The updated object.

Return type:

object