causallib.evaluation.evaluator module

Methods for evaluating causal inference models.

Copyright 2019 IBM Corp.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Created on Dec 25, 2018

causallib.evaluation.evaluator.evaluate(estimator, X, a, y, cv=None, metrics_to_evaluate='defaults', plots=False)[source]

Evaluate a model using cross-validation on the provided data.

Parameters
  • estimator (causallib.estimation.base_estimator.IndividualOutcomeEstimator | causallib.estimation.base_weight.WeightEstimator | causallib.estimation.base_weight.PropensityEstimator) – the estimator to evaluate. If using cv, it will be refit; otherwise it should already be fit.

  • X (pd.DataFrame) – Covariates.

  • a (pd.Series) – Treatment assignment.

  • y (pd.Series) – Outcome.

  • cv (list[tuple] | generator[tuple] | "auto" | None) – list of tuples of indices (train_idx, validation_idx), one per fold, in an iloc manner (row number). If None, no cross-validation is performed. If cv="auto", a stratified KFold with 5 folds is created and used for cross-validation.

  • metrics_to_evaluate (dict | "defaults" | None) – key: metric’s name, value: callable that receives true labels, prediction, and sample_weights (the latter may be ignored). If “defaults”, default metrics are selected. If None, no metrics are evaluated.

  • plots (bool) – whether to generate plots

Returns

EvaluationResults

causallib.evaluation.evaluator.evaluate_bootstrap(estimator, X, a, y, n_bootstrap, n_samples=None, replace=True, refit=False, metrics_to_evaluate=None)[source]

Evaluate a model on bootstrap samples of the provided data.

Parameters
  • estimator – the estimator to evaluate (see evaluate() for accepted types). If refit=False, it should already be fit.
  • X (pd.DataFrame) – Covariates.

  • a (pd.Series) – Treatment assignment.

  • y (pd.Series) – Outcome.

  • n_bootstrap (int) – Number of bootstrap samples to create.

  • n_samples (int | None) – Number of samples to draw in each bootstrap sampling. If None, the number of samples (first dimension) of the data is used.

  • replace (bool) – Whether to sample with replacement. If False, n_samples (if provided) should be smaller than X.shape[0].

  • refit (bool) – Whether to refit the estimator on each bootstrap sample. Can be computationally intensive if n_bootstrap is large.

  • metrics_to_evaluate (dict | None) – key: metric’s name, value: callable that receives true labels, prediction, and sample_weights (the latter is allowed to be ignored). If not provided, defaults from causallib.evaluation.metrics are used.

Returns

EvaluationResults