causallib.preprocessing.confounder_selection.RecursiveConfounderElimination

class RecursiveConfounderElimination(estimator, n_features_to_select=1, step=1, importance_getter='auto', covariates=None)

Prune confounders by recursive elimination.

Parameters:
  • estimator – Estimator to fit for every step of recursive elimination.

  • n_features_to_select (int) – The number of confounders to keep.

  • step (int) – The number of confounders to eliminate in one iteration.

  • importance_getter (str | callable) – How to obtain feature importances. Either a callable that takes an estimator, one of the strings ‘coef_’ or ‘feature_importances_’, or ‘auto’, which detects ‘coef_’ or ‘feature_importances_’ automatically.

  • covariates (list | numpy.ndarray) – A subset of columns to perform selection on. Columns in X but not in covariates are retained after transform regardless of the selection. Can be a list of column names, an array of boolean indicators with one entry per column of X, or anything compatible with the pandas loc indexer for columns. If None, all columns participate in the selection process. This is similar to using sklearn’s ColumnTransformer or make_column_selector.
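The interplay of n_features_to_select and step can be illustrated with a minimal, library-free sketch of a recursive elimination loop. This is not causallib's implementation: the function recursive_elimination, its importance_fn argument (a stand-in for "fit the estimator on the remaining columns and read off importances"), the ranking by absolute value, and the toy scores are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence


def recursive_elimination(
    features: Sequence[str],
    importance_fn: Callable[[List[str]], Dict[str, float]],
    n_features_to_select: int = 1,
    step: int = 1,
) -> List[str]:
    """Drop the least important features until n_features_to_select remain.

    importance_fn is a hypothetical stand-in for refitting an estimator on
    the surviving features and returning one importance per feature.
    """
    remaining = list(features)
    while len(remaining) > n_features_to_select:
        scores = importance_fn(remaining)  # "refit" on the surviving features
        # Never drop below the requested number of features.
        n_drop = min(step, len(remaining) - n_features_to_select)
        # Rank by absolute importance, least important first (an assumption).
        ranked = sorted(remaining, key=lambda f: abs(scores[f]))
        dropped = set(ranked[:n_drop])
        remaining = [f for f in remaining if f not in dropped]
    return remaining


# Toy importances, made up for illustration; a real run would refit a model.
toy_scores = {"age": 0.9, "bmi": -0.4, "noise_1": 0.05, "noise_2": -0.01}
selected = recursive_elimination(
    list(toy_scores),
    importance_fn=lambda cols: {c: toy_scores[c] for c in cols},
    n_features_to_select=2,
    step=1,
)
print(selected)
```

With step=1 the loop runs once per eliminated feature; a larger step removes several features per iteration at the cost of fewer refits in between.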

__init__(estimator, n_features_to_select=1, step=1, importance_getter='auto', covariates=None)

Prune confounders by recursive elimination.

Parameters:
  • estimator – Estimator to fit for every step of recursive elimination.

  • n_features_to_select (int) – The number of confounders to keep.

  • step (int) – The number of confounders to eliminate in one iteration.

  • importance_getter (str | callable) – How to obtain feature importances. Either a callable that takes an estimator, one of the strings ‘coef_’ or ‘feature_importances_’, or ‘auto’, which detects ‘coef_’ or ‘feature_importances_’ automatically.

  • covariates (list | numpy.ndarray) – A subset of columns to perform selection on. Columns in X but not in covariates are retained after transform regardless of the selection. Can be a list of column names, an array of boolean indicators with one entry per column of X, or anything compatible with the pandas loc indexer for columns. If None, all columns participate in the selection process. This is similar to using sklearn’s ColumnTransformer or make_column_selector.
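The semantics described for importance_getter and covariates can be sketched without the library. The helpers below (resolve_importances and columns_after_transform) are hypothetical names, not part of causallib, and the covariates helper handles only the list-of-column-names form; the stand-in estimators are plain objects carrying the conventional attribute names.

```python
from types import SimpleNamespace


def resolve_importances(estimator, importance_getter="auto"):
    """Return raw importances from a fitted estimator-like object."""
    if callable(importance_getter):
        return importance_getter(estimator)
    if importance_getter == "auto":
        # Try the two conventional attribute names in turn.
        for name in ("coef_", "feature_importances_"):
            if hasattr(estimator, name):
                return getattr(estimator, name)
        raise ValueError("estimator exposes neither coef_ nor feature_importances_")
    # Otherwise treat the string as an attribute name.
    return getattr(estimator, importance_getter)


def columns_after_transform(all_columns, covariates, selected):
    """Keep every column outside `covariates`, plus the selected covariates."""
    candidates = set(all_columns if covariates is None else covariates)
    chosen = set(selected)
    return [c for c in all_columns if c not in candidates or c in chosen]


# Stand-ins for fitted estimators (illustrative objects, not real models).
linear_like = SimpleNamespace(coef_=[0.5, -1.2])
tree_like = SimpleNamespace(feature_importances_=[0.7, 0.3])

auto_linear = resolve_importances(linear_like)
auto_tree = resolve_importances(tree_like)
custom = resolve_importances(linear_like, lambda est: [abs(c) for c in est.coef_])

# "site" is not a selection candidate, so it survives regardless of the outcome.
kept = columns_after_transform(
    all_columns=["age", "bmi", "site", "noise"],
    covariates=["age", "bmi", "noise"],
    selected=["age"],
)
print(auto_linear, auto_tree, custom, kept)
```

Restricting covariates this way is useful when some columns must always be adjusted for and should never be candidates for elimination.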

fit(X, *args, **kwargs)