causallib.estimation.tmle module

class causallib.estimation.tmle.BaseCleverCovariate(weight_model)[source]

Bases: object

abstract clever_covariate_fit(X, a)[source]
abstract clever_covariate_inference(X, a, treatment_value)[source]
abstract sample_weights(X, a)[source]
class causallib.estimation.tmle.CleverCovariateFeatureMatrix(weight_model)[source]

Bases: causallib.estimation.tmle.CleverCovariateImportanceSamplingMatrix

The clever covariate is a matrix of inverse propensity weights for all treatment values, used as predictors in the targeting regression.

clever_covariate_fit(X, a)[source]
clever_covariate_inference(X, a, treatment_value)[source]
sample_weights(X, a)[source]
class causallib.estimation.tmle.CleverCovariateFeatureVector(weight_model)[source]

Bases: causallib.estimation.tmle.BaseCleverCovariate

The clever covariate is a signed vector of inverse propensity weights, in which the control group's weights are negated. The vector is then used as a predictor in the targeting regression.

clever_covariate_fit(X, a)[source]
clever_covariate_inference(X, a, treatment_value)[source]
sample_weights(X, a)[source]
class causallib.estimation.tmle.CleverCovariateImportanceSamplingMatrix(weight_model)[source]

Bases: causallib.estimation.tmle.BaseCleverCovariate

The clever covariate, a vector of inverse propensity weights, is used as the sample weight for the targeting regression. The predictors are a one-hot (full dummy) encoding of the treatment assignment.

clever_covariate_fit(X, a)[source]
clever_covariate_inference(X, a, treatment_value)[source]
sample_weights(X, a)[source]
class causallib.estimation.tmle.CleverCovariateImportanceSamplingVector(weight_model)[source]

Bases: causallib.estimation.tmle.BaseCleverCovariate

The clever covariate, a vector of inverse propensity weights, is used as the sample weight for the targeting regression. The predictor is a signed vector with -1 for the control group.

clever_covariate_fit(X, a)[source]
clever_covariate_inference(X, a, treatment_value)[source]
sample_weights(X, a)[source]
class causallib.estimation.tmle.TMLE(outcome_model, weight_model, outcome_covariates=None, weight_covariates=None, reduced=False, importance_sampling=False, glm_fit_kwargs=None)[source]

Bases: causallib.estimation.doubly_robust.BaseDoublyRobust

Targeted Maximum Likelihood Estimation. A model that takes an outcome model that was optimized to predict E[Y|X,A], and “retargets” (“updates”) it to estimate E[Y^A|X] using a “clever covariate” constructed from the inverse propensity weights.

Steps:
  1. Fit an outcome model Y=Q(X,A).

  2. Fit a weight model A=g(X,A).

  3. Construct a clever covariate using g(X,A).

  4. Fit a logistic regression model Q* to predict Y using g(X,A) as features and Q(X,A) as offset.

  5. Predict the counterfactual outcome Q*(X,a) for treatment value a by plugging in Q(X,a) as the offset and g(X,a) as the covariate.
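
The targeting in steps 3 to 5 can be sketched with the standard library alone. This is a minimal illustration, not causallib's implementation: it assumes a binary treatment, a binary outcome, and the signed-vector clever covariate; the helper fit_epsilon and all numbers are hypothetical, and the predictions from steps 1 and 2 are hard-coded rather than fitted.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_epsilon(y, offset, h, n_iter=25):
    """Newton-Raphson fit of eps in P(y=1) = expit(offset + eps * h).

    Hypothetical helper: a one-parameter logistic regression with an offset.
    """
    eps = 0.0
    for _ in range(n_iter):
        mu = [expit(o + eps * h_i) for o, h_i in zip(offset, h)]
        score = sum(h_i * (y_i - m) for h_i, y_i, m in zip(h, y, mu))
        info = sum(h_i * h_i * m * (1.0 - m) for h_i, m in zip(h, mu))
        if info == 0.0:
            break
        eps += score / info
    return eps


# Toy data (made-up numbers): binary treatment a and binary outcome y.
a = [1, 1, 0, 0, 1, 0]
y = [1, 0, 0, 1, 1, 0]
# Steps 1-2 are assumed done elsewhere; these stand in for model predictions.
q1 = [0.8, 0.6, 0.5, 0.4, 0.9, 0.5]          # Q(X, 1)
q0 = [0.5, 0.4, 0.3, 0.3, 0.6, 0.2]          # Q(X, 0)
propensity = [0.7, 0.6, 0.4, 0.3, 0.8, 0.5]  # P(A=1 | X)

# Step 3: signed inverse-propensity clever covariate for the observed treatment.
h_obs = [1.0 / p if a_i == 1 else -1.0 / (1.0 - p)
         for a_i, p in zip(a, propensity)]

# Step 4: regress y on the clever covariate, with logit(Q(X, A)) as offset.
offset = [logit(q1_i if a_i == 1 else q0_i)
          for a_i, q1_i, q0_i in zip(a, q1, q0)]
eps = fit_epsilon(y, offset, h_obs)

# Step 5: plug in each treatment value to get the updated predictions.
q1_star = [expit(logit(q) + eps / p) for q, p in zip(q1, propensity)]
q0_star = [expit(logit(q) - eps / (1.0 - p)) for q, p in zip(q0, propensity)]

ate = sum(q1_star) / len(q1_star) - sum(q0_star) / len(q0_star)
print(f"epsilon={eps:.4f}, targeted ATE={ate:.4f}")
```

At the fitted epsilon, the clever covariate is uncorrelated with the residuals of the updated predictions, which is the condition the targeting step solves for.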

Implements 4 flavours of TMLE, controlled by the reduced and importance_sampling parameters. importance_sampling=True moves the clever covariate from being a feature to being a sample weight in the targeting regression. reduced=True uses a clever covariate vector of 1s and -1s, and is therefore only suitable for binary treatment. Otherwise, the clever covariate is the entire IPW matrix and can be used for multiple treatments.
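
The four combinations can be illustrated for a binary treatment as follows. This is a sketch under assumptions, not the library's code: the function clever_covariate is hypothetical, the mapping of each branch to a class is inferred from the class descriptions in this module, and masking the weight matrix by the observed treatment in the matrix-feature case is an assumption.

```python
def clever_covariate(a, propensity, reduced, importance_sampling):
    """Return (features, sample_weights) for the targeting regression.

    Hypothetical helper for a binary treatment; propensity is P(A=1 | X).
    """
    # Inverse propensity weights for every treatment value: (a=0, a=1).
    w = [(1.0 / (1.0 - p), 1.0 / p) for p in propensity]
    # Weight of the treatment each individual actually received.
    w_obs = [w_i[a_i] for w_i, a_i in zip(w, a)]

    if reduced:
        sign = [1.0 if a_i == 1 else -1.0 for a_i in a]
        if importance_sampling:
            # Presumably CleverCovariateImportanceSamplingVector.
            return sign, w_obs
        # Presumably CleverCovariateFeatureVector.
        return [s * w_i for s, w_i in zip(sign, w_obs)], None

    one_hot = [(1.0 - a_i, float(a_i)) for a_i in a]
    if importance_sampling:
        # Presumably CleverCovariateImportanceSamplingMatrix.
        return one_hot, w_obs
    # Presumably CleverCovariateFeatureMatrix (assumed masked by treatment).
    masked = [(d0 * w0, d1 * w1) for (d0, d1), (w0, w1) in zip(one_hot, w)]
    return masked, None


a = [1, 0]
propensity = [0.8, 0.25]
for reduced in (True, False):
    for importance_sampling in (False, True):
        features, weights = clever_covariate(
            a, propensity, reduced, importance_sampling
        )
        print(reduced, importance_sampling, features, weights)
```

In the importance-sampling variants the features carry only the treatment assignment, so extreme propensities affect the regression through the sample weights rather than through the feature scale.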

Parameters
  • outcome_model (IndividualOutcomeEstimator) – An initial prediction of the outcome

  • weight_model (PropensityEstimator) – An IPW model predicting the treatment.

  • outcome_covariates (array) – Covariates to use for outcome model. If None - all covariates passed will be used. Either list of column names or boolean mask.

  • weight_covariates (array) – Covariates to use for weight model. If None - all covariates passed will be used. Either list of column names or boolean mask.

  • reduced (bool) – If True, uses a vector version of the clever covariate (rather than a matrix of all treatment values) and enforces a binary treatment assignment.

  • importance_sampling (bool) – If True moves the clever covariate from being a feature to being a weight in the regression.

  • glm_fit_kwargs (dict) – Additional kwargs for statsmodels’ GLM.fit(). Can be used, for example, to tune the optimizer. See: https://www.statsmodels.org/stable/generated/statsmodels.genmod.generalized_linear_model.GLM.fit.html

estimate_individual_outcome(X, a, treatment_values=None, predict_proba=None)[source]

Estimates individual outcome under different treatment values (interventions)

Parameters
  • X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pd.Series) – Treatment assignment of size (num_subjects,).

  • treatment_values (Any) – Desired treatment value/s to use when estimating the counterfactual outcome. If not supplied, calculates for all available treatment values.

  • predict_proba (bool | None) – In case the outcome task is classification and in case learner supports the operation, if True - prediction will utilize learner’s predict_proba or decision_function which returns a continuous matrix of size (n_samples, n_classes). If False - predict will be used and return value will be based on a vector of class classifications. If None - parameter is ignored and behaviour is as specified when initializing the IndividualOutcomeEstimator.

Returns

A DataFrame whose columns are treatment values and whose rows are individuals: each column is a vector of size (num_samples,) that contains the estimated outcome for each individual under the treatment value of that column.

Return type

pd.DataFrame

fit(X, a, y, refit_weight_model=True, **kwargs)[source]

Trains a causal model from observed data.

Parameters
  • X (pd.DataFrame) – Covariate matrix of size (num_subjects, num_features).

  • a (pd.Series) – Treatment assignment of size (num_subjects,).

  • y (pd.Series) – Observed outcome of size (num_subjects,).

  • sample_weight – To be passed to the underlying scikit-learn’s fit method.

Returns

A causal weight model with an inner learner fitted.

Return type

IndividualOutcomeEstimator

class causallib.estimation.tmle.TargetMinMaxScaler(feature_range=(0, 1), *, copy=True, clip=False)[source]

Bases: sklearn.preprocessing._data.MinMaxScaler

A MinMaxScaler that operates on a vector (Series)

fit(X, y=None)[source]

Compute the minimum and maximum to be used for later scaling.

Parameters
  • X (array-like of shape (n_samples, n_features)) – The data used to compute the per-feature minimum and maximum used for later scaling along the features axis.

  • y (None) – Ignored.

Returns

self – Fitted scaler.

Return type

object

inverse_transform(X)[source]

Undo the scaling of X according to feature_range.

Parameters

X (array-like of shape (n_samples, n_features)) – Input data that will be transformed. It cannot be sparse.

Returns

Xt – Transformed data.

Return type

ndarray of shape (n_samples, n_features)

transform(X)[source]

Scale features of X according to feature_range.

Parameters

X (array-like of shape (n_samples, n_features)) – Input data that will be transformed.

Returns

Xt – Transformed data.

Return type

ndarray of shape (n_samples, n_features)
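
The arithmetic behind fit, transform and inverse_transform on a single vector can be sketched without scikit-learn. This follows the min-max formula documented for MinMaxScaler; the function names are hypothetical, and the numbers are made up. A likely use in TMLE (an assumption, not stated in this module) is mapping a continuous outcome into [0, 1] before the logistic targeting step and mapping predictions back afterwards.

```python
def fit_min_max(values, feature_range=(0.0, 1.0)):
    """Return (scale, offset) so that scaled = value * scale + offset."""
    lo, hi = feature_range
    data_min, data_max = min(values), max(values)
    span = data_max - data_min
    # Constant vector: fall back to a unit scale to avoid dividing by zero.
    scale = (hi - lo) / span if span != 0 else 1.0
    return scale, lo - data_min * scale


def transform(values, scale, offset):
    return [v * scale + offset for v in values]


def inverse_transform(scaled, scale, offset):
    return [(s - offset) / scale for s in scaled]


y = [3.0, 7.5, 12.0, 5.0]
scale, offset = fit_min_max(y)
y_scaled = transform(y, scale, offset)
y_back = inverse_transform(y_scaled, scale, offset)
print(y_scaled)
print(y_back)
```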