causallib.estimation.standardization module

Copyright 2019 IBM Corp.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Created on Apr 25, 2018

class causallib.estimation.standardization.Standardization(learner, encode_treatment=False, predict_proba=False)[source]

Bases: causallib.estimation.base_estimator.IndividualOutcomeEstimator

Standard standardization model for causal inference. Learns an outcome model that takes the treatment assignment as an input alongside the covariates. At prediction time the treatment value can be set to any desired value (an intervention), which changes the predicted outcome accordingly.

Parameters
  • learner – Initialized sklearn model.

  • encode_treatment (bool) – Whether to encode the treatment as a one-hot matrix. Usually helpful when there are more than two treatment values.

  • predict_proba (bool) – Applies when the outcome task is classification and the learner supports the operation. If True, prediction uses the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict is used and the return value is a vector of predicted class labels.
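The idea described above (fit one outcome model on covariates plus treatment, then overwrite the treatment to predict under each intervention) can be sketched with plain scikit-learn. This is a conceptual illustration only, not causallib's implementation: the helper name `standardize_outcomes`, the added column name `treatment`, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def standardize_outcomes(learner, X, a, y, treatment_values=None):
    """Hypothetical helper: fit on [X, a], then predict under each intervention."""
    # Append the observed treatment as one more feature column.
    # (Assumes X has no existing column called "treatment".)
    learner.fit(X.assign(treatment=a), y)
    if treatment_values is None:
        treatment_values = sorted(a.unique().tolist())
    outcomes = {}
    for value in treatment_values:
        # Intervene: give every subject the same treatment value.
        outcomes[value] = learner.predict(X.assign(treatment=value))
    # Columns are treatment values, rows are individuals.
    return pd.DataFrame(outcomes, index=X.index)


# Synthetic data in which treatment raises the outcome by 2.0.
rng = np.random.default_rng(0)
X = pd.DataFrame({"x": rng.normal(size=200)})
a = pd.Series(rng.integers(0, 2, size=200))
y = X["x"] + 2.0 * a + rng.normal(scale=0.1, size=200)

outcomes = standardize_outcomes(LinearRegression(), X, a, y)
effect = (outcomes[1] - outcomes[0]).mean()
```

Here `effect` should land close to the simulated value of 2.0, because the linear learner matches the data-generating process; with a misspecified learner the estimate would be biased.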

estimate_individual_outcome(X, a, treatment_values=None, predict_proba=None)[source]

Estimates individual outcomes under different treatment values (interventions).

Parameters
  • X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pd.Series) – Treatment assignment of size (num_subjects,).

  • treatment_values (Any) – Desired treatment value/s to use when estimating the counterfactual outcome. If not supplied, outcomes are calculated for all available treatment values.

  • predict_proba (bool | None) – Applies when the outcome task is classification and the learner supports the operation. If True, prediction uses the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict is used and the return value is a vector of predicted class labels. If None, this argument is ignored and the behaviour specified when initializing the IndividualOutcomeEstimator is used.

Returns

DataFrame whose columns are treatment values and whose rows are individuals: each column is a vector of size (num_samples,) that contains the estimated outcome for each individual under the treatment value of that column.

Return type

pd.DataFrame
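To make the predict_proba option concrete, the sketch below contrasts the two kinds of learner output on a binary outcome, using scikit-learn directly rather than causallib. The data, the added `treatment` column, and the choice of LogisticRegression are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = pd.DataFrame({"x": rng.normal(size=300)})
a = pd.Series(rng.integers(0, 2, size=300))
# Binary outcome whose probability rises with x and with treatment.
p = 1 / (1 + np.exp(-(X["x"] + 1.5 * a - 0.5)))
y = pd.Series(rng.binomial(1, p))

model = LogisticRegression().fit(X.assign(treatment=a), y)
X_treated = X.assign(treatment=1)  # intervene: everyone treated

labels = model.predict(X_treated)        # hard class labels, shape (n_samples,)
scores = model.predict_proba(X_treated)  # probabilities, shape (n_samples, n_classes)
```

Hard labels discard how confident the learner is, so averaging them over subjects generally gives a coarser population estimate than averaging the class probabilities.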

fit(X, a, y, sample_weight=None)[source]

Trains a causal model from observed data.

Parameters
  • X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pd.Series) – Treatment assignment of size (num_subjects,).

  • y (pd.Series) – Observed outcome of size (num_subjects,).

  • sample_weight – To be passed to the underlying scikit-learn learner’s fit method.

Returns

A causal outcome model with its inner learner fitted.

Return type

IndividualOutcomeEstimator
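As a rough illustration of what forwarding sample_weight to the learner means, the sketch below fits the same scikit-learn model with and without weights. It assumes the weights reach the learner's fit unchanged, as the parameter description states; the tiny dataset and the `treatment` column name are made up for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

X = pd.DataFrame({"x": [0.0, 1.0, 2.0, 3.0]})
a = pd.Series([0, 1, 0, 1])
y = pd.Series([0.0, 3.0, 2.0, 50.0])     # the last subject is an outlier
weights = np.array([1.0, 1.0, 1.0, 0.0])  # zero weight removes its influence

design = X.assign(treatment=a)
unweighted = LinearRegression().fit(design, y)
weighted = LinearRegression().fit(design, y, sample_weight=weights)
```

With the outlier weighted out, the remaining three subjects are fit exactly by y = x + 2 * treatment, so `weighted.coef_` recovers those coefficients while `unweighted.coef_` does not.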

class causallib.estimation.standardization.StratifiedStandardization(learner, treatment_values=None, predict_proba=False)[source]

Bases: causallib.estimation.base_estimator.IndividualOutcomeEstimator

Standardization model that learns a model for each treatment group (i.e. subgroup of subjects with the same treatment assignment).

Parameters
  • learner – Initialized sklearn model, or a mapping (dict) between treatment values and initialized models. For example: {0: Ridge(alpha=5), 1: Ridge(alpha=0.1)}, or even entirely different models: {0: Ridge(), 1: RandomForestRegressor()}. Make sure the keys of the mapping cover all treatment values found in later use.

  • treatment_values (list) – List of unique treatment values (can also be a single value). If known at initialization time, it can be passed to init; otherwise it is inferred during fit (where the treatment assignment must be supplied). Make sure these treatment_values cover all treatment values found in later use.

  • predict_proba (bool) – Applies when the outcome task is classification and the learner supports the operation. If True, prediction uses the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict is used and the return value is a vector of predicted class labels.
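The stratified approach (one model per treatment group, each then used to predict for every subject) can be sketched with plain scikit-learn as follows. This is a conceptual illustration under stated assumptions, not causallib's implementation: the helper `stratified_outcomes` and the synthetic data are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.linear_model import LinearRegression, Ridge


def stratified_outcomes(learner, X, a, y):
    """Hypothetical helper: one outcome model per treatment group."""
    treatment_values = sorted(a.unique().tolist())
    if not isinstance(learner, dict):
        # A single model is cloned so each group gets its own unfitted copy.
        learner = {value: clone(learner) for value in treatment_values}
    outcomes = {}
    for value in treatment_values:
        in_group = a == value
        model = learner[value]  # KeyError if the mapping misses a treatment value
        model.fit(X[in_group], y[in_group])
        # Predict for all subjects, not only those in this group.
        outcomes[value] = model.predict(X)
    return pd.DataFrame(outcomes, index=X.index)


rng = np.random.default_rng(0)
X = pd.DataFrame({"x": rng.normal(size=200)})
a = pd.Series(rng.integers(0, 2, size=200))
y = X["x"] + 2.0 * a + rng.normal(scale=0.1, size=200)

# Either one model for all groups, or a mapping from treatment value to model.
shared = stratified_outcomes(LinearRegression(), X, a, y)
mapped = stratified_outcomes({0: Ridge(alpha=5), 1: LinearRegression()}, X, a, y)
effect = (mapped[1] - mapped[0]).mean()
```

The `KeyError` noted in the comment is why the mapping's keys need to cover every treatment value that appears in the data.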

estimate_individual_outcome(X, a, treatment_values=None, predict_proba=None)[source]

Estimates individual outcomes under different treatment values (interventions).

Parameters
  • X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pd.Series) – Treatment assignment of size (num_subjects,).

  • treatment_values (Any) – Desired treatment value/s to use when estimating the counterfactual outcome. If not supplied, outcomes are calculated for all available treatment values.

  • predict_proba (bool | None) – Applies when the outcome task is classification and the learner supports the operation. If True, prediction uses the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict is used and the return value is a vector of predicted class labels. If None, this argument is ignored and the behaviour specified when initializing the IndividualOutcomeEstimator is used.

Returns

DataFrame whose columns are treatment values and whose rows are individuals: each column is a vector of size (num_samples,) that contains the estimated outcome for each individual under the treatment value of that column.

Return type

pd.DataFrame
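A returned frame of this shape is straightforward to aggregate with pandas. The sketch below uses a small hand-written DataFrame as a stand-in for the method's output; the values and subject labels are hypothetical.

```python
import pandas as pd

# Hypothetical output: 4 subjects, treatment values 0 and 1 as columns.
individual_outcomes = pd.DataFrame(
    {0: [1.0, 2.0, 3.0, 4.0], 1: [2.0, 2.5, 4.0, 5.5]},
    index=["s1", "s2", "s3", "s4"],
)

# Averaging each column gives the estimated population outcome per treatment.
population_outcomes = individual_outcomes.mean(axis="index")

# The row-wise difference is the estimated individual effect of 1 versus 0.
individual_effect = individual_outcomes[1] - individual_outcomes[0]
average_effect = individual_effect.mean()
```

Averaging over rows in this way is the standardization step itself: the model's predictions are averaged over the covariate distribution of the sample.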

fit(X, a, y, sample_weight=None)[source]

Trains a causal model from observed data.

Parameters
  • X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pd.Series) – Treatment assignment of size (num_subjects,).

  • y (pd.Series) – Observed outcome of size (num_subjects,).

  • sample_weight – To be passed to the underlying scikit-learn learner’s fit method.

Returns

A causal outcome model with its inner learners fitted.

Return type

IndividualOutcomeEstimator