causallib.estimation.OverlapWeights

class OverlapWeights(learner, use_stabilized=False)

Implementation of overlap (propensity score) weighting:

https://www.tandfonline.com/doi/full/10.1080/01621459.2016.1260466

A method to balance observed covariates between treatment groups in observational studies. It down-weights observations with extreme propensity scores and up-weights observations with central propensity scores, i.e. those in the region where the treatment groups overlap.

Each unit’s weight is proportional to the probability of that unit being assigned to the opposite group: w_i = 1 - Pr[A=a_i | X_i]

This method assumes only two treatment groups exist.

Parameters:

learner – Initialized sklearn model.
use_stabilized (bool) – Whether to re-weigh the learned weights with the prevalence of the treatment. See Also: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4351790/#S6title
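
The weight formula in the class description can be illustrated with a minimal sketch in plain Python. It does not call causallib: the helper name overlap_weights and the hard-coded propensity scores are assumptions made for illustration, standing in for the output of a fitted learner.

```python
def overlap_weights(propensity_treated, treatment):
    """Return w_i = 1 - Pr[A=a_i | X_i] for a binary treatment.

    propensity_treated: Pr[A=1 | X_i] per unit (assumed already estimated).
    treatment: observed assignment per unit, 0 or 1.
    """
    weights = []
    for p, a in zip(propensity_treated, treatment):
        # Probability of the group the unit actually belongs to.
        p_observed = p if a == 1 else 1.0 - p
        # The weight is the probability of the opposite group.
        weights.append(1.0 - p_observed)
    return weights


# Hypothetical propensity scores Pr[A=1 | X_i] and observed assignments.
propensity = [0.95, 0.50, 0.10, 0.50]
assignment = [1, 1, 0, 0]
weights = overlap_weights(propensity, assignment)
```

Units whose assignment was nearly certain (the first and third) receive small weights, while units with a propensity near 0.5 receive the largest weight.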

__init__(learner, use_stabilized=False)

compute_weight_matrix(X, a, clip_min=None, clip_max=None, use_stabilized=None)

Computes each individual's weight for every possible treatment value: w_ij = 1 - Pr[A=a_j | X_i] for every individual i and treatment j.

Parameters:

X (pandas.DataFrame) – Covariate matrix of size (num_subjects, num_features).
a (pandas.Series) – Treatment assignment of size (num_subjects,).
clip_min (None|float) – Lower bound for propensity scores. Best left as None.
clip_max (None|float) – Upper bound for propensity scores. Best left as None.
use_stabilized (None|bool) – Whether to re-weigh the learned weights with the prevalence of the treatment. This overrides the use_stabilized parameter provided at initialization. If True is provided but the model was initialized with use_stabilized=False, the prevalence is calculated from the data at hand rather than from the training data. See Also: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4351790/#S6title

Returns:

A matrix of size (num_subjects, num_treatments) with a weight for every individual and every treatment.

Return type:

pandas.DataFrame
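
As a rough illustration of the w_ij formula and of the clipping parameters, the sketch below builds the weight matrix for a binary treatment in plain Python. It is not the library's implementation: the helper names are hypothetical, the matrix is a list of dicts rather than a pandas.DataFrame, and it assumes that clipping is applied to each propensity score before it is subtracted from 1.

```python
def clip(value, lower=None, upper=None):
    """Bound value to [lower, upper]; None disables that bound."""
    if lower is not None:
        value = max(value, lower)
    if upper is not None:
        value = min(value, upper)
    return value


def overlap_weight_matrix(propensity_treated, clip_min=None, clip_max=None):
    """Return one row per unit, mapping treatment j to 1 - Pr[A=j | X_i]."""
    matrix = []
    for p in propensity_treated:
        # Propensity of each treatment value for this unit.
        p_by_treatment = {0: 1.0 - p, 1: p}
        row = {}
        for j, p_j in p_by_treatment.items():
            # Assumption: scores are clipped first, then turned into weights.
            row[j] = 1.0 - clip(p_j, clip_min, clip_max)
        matrix.append(row)
    return matrix


# Hypothetical propensity scores Pr[A=1 | X_i].
scores = [0.95, 0.50, 0.10]
matrix = overlap_weight_matrix(scores)
clipped = overlap_weight_matrix(scores, clip_min=0.2, clip_max=0.8)
```

Because overlap weights are already bounded between 0 and 1, clipping mainly distorts them, which is consistent with the advice to leave clip_min and clip_max as None.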

stabilize_weights(a, weight_matrix, use_stabilized=False)

Adjust sample weights according to class prevalence: Pr[A=a_i] * w_i

Parameters:

a (pandas.Series) – Treatment assignment of size (num_subjects,).
weight_matrix (pandas.DataFrame) – Weight matrix of size (num_subjects, num_treatments).
use_stabilized (None|bool) – Whether to re-weigh the learned weights with the prevalence of the treatment. This overrides the use_stabilized parameter provided at initialization. If True is provided but the model was initialized with use_stabilized=False, the prevalence is calculated from the data at hand rather than from the training data. See Also: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4351790/#S6title

Returns:

A matrix of size (num_subjects, num_treatments) with the weight (stabilized if use_stabilized is True) for every individual and every treatment.

Return type:

pandas.DataFrame
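
The prevalence adjustment Pr[A=a_i] * w_i can be sketched as follows. This is a plain-Python approximation under stated assumptions, not the library's code: the function name stabilize is hypothetical, the weight matrix is a list of dicts keyed by treatment value, and each treatment column is assumed to be scaled by that treatment's share in a.

```python
from collections import Counter


def stabilize(treatment, weight_matrix, use_stabilized=False):
    """Scale each treatment column by its marginal prevalence Pr[A=j]."""
    if not use_stabilized:
        # Weights are returned unchanged when stabilization is off.
        return weight_matrix
    counts = Counter(treatment)
    n = len(treatment)
    return [
        {j: (counts[j] / n) * w for j, w in row.items()}
        for row in weight_matrix
    ]


# Hypothetical data: one treated unit out of four, all weights equal to 0.5.
assignment = [1, 0, 0, 0]
raw = [{0: 0.5, 1: 0.5} for _ in assignment]
stabilized = stabilize(assignment, raw, use_stabilized=True)
unchanged = stabilize(assignment, raw)
```

Here Pr[A=0] = 0.75 and Pr[A=1] = 0.25, so every row becomes {0: 0.375, 1: 0.125}.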

set_fit_request(*, a='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

a (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for a parameter in fit.

Returns:

self – The updated object.

Return type:

object