causallib.contrib.shared_sparsity_selection.shared_sparsity_selection module

class causallib.contrib.shared_sparsity_selection.shared_sparsity_selection.SharedSparsityConfounderSelection(mcp_lambda='auto', mcp_alpha=1, step=0.1, max_iter=1000, tol=0.001, threshold=1e-06, importance_getter=None, covariates=None)[source]

Bases: causallib.preprocessing.confounder_selection._BaseConfounderSelection

Class to select confounders using the shared sparsity method by Greenewald, Katz-Rogozhnikov, and Shanmugam: https://arxiv.org/abs/2011.01979

Constructor for SharedSparsityConfounderSelection

Parameters
  • mcp_lambda (str|float) – Parameter (>= 0) controlling the shape of the MCP regularizer. The larger the value, the stronger the regularization. “auto” automatically selects a suitable regularization value.

  • mcp_alpha (float) – Parameter (>= 0) that, together with mcp_lambda, controls the shape of the MCP regularizer. The smaller the value, the stronger the regularization.

  • step (float) – Step size for the proximal gradient method; equivalent to a learning rate.

  • max_iter (int) – Maximum number of iterations of the MCP proximal gradient method.

  • tol (float) – Stopping criterion for MCP. If the normalized value of the proximal gradient is less than tol, the algorithm is considered converged.

  • threshold (float) – A confounder is retained by the transform() call only if its importance exceeds threshold for all treatment values.

  • importance_getter – IGNORED.

  • covariates (list | np.ndarray) – Specifies a subset of columns to perform selection on. Columns in X but not in covariates will be included after transform regardless of the selection. Can be a list of column names, a boolean indicator array matching the number of columns in X, or anything compatible with pandas’ loc function for columns. If None, all columns participate in the selection process. This is similar to using sklearn’s ColumnTransformer or make_column_selector.
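To build intuition for how mcp_lambda, mcp_alpha, and step interact, the sketch below implements the scalar proximal operator ("firm thresholding") of the MCP penalty. This is an illustrative, simplified sketch only, not causallib’s implementation: the library applies a shared-sparsity (group-wise) variant across treatment values, and the names lam, gamma, and step here are stand-ins for mcp_lambda, mcp_alpha, and step. Coefficients with magnitude at most lam * step are zeroed out (driving confounder selection), moderate values are shrunk, and large values (above gamma * lam) pass through unchanged.

```python
import math


def mcp_prox(z, lam, gamma, step=1.0):
    """Scalar proximal operator of the MCP penalty (firm thresholding).

    Illustrative sketch; assumes gamma > step. lam plays the role of
    mcp_lambda (larger -> stronger regularization) and gamma the role of
    mcp_alpha (smaller -> stronger regularization).
    """
    az = abs(z)
    if az <= lam * step:
        # Small coefficients are zeroed out -> the confounder is dropped.
        return 0.0
    if az <= gamma * lam:
        # Moderate coefficients are shrunk toward zero.
        return math.copysign((az - lam * step) / (1.0 - step / gamma), z)
    # Large coefficients are left unbiased, unlike plain soft thresholding.
    return z
```

For example, with lam=1.0 and gamma=3.0, a coefficient of 0.5 is zeroed, 2.0 shrinks to 1.5, and 5.0 is returned unchanged, illustrating why MCP induces sparsity with less bias on large coefficients than the lasso.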

fit(X, *args, **kwargs)