causallib.estimation.OverlapWeights#
- class OverlapWeights(learner, use_stabilized=False)[source]#
Implementation of overlap (propensity score) weighting:
https://www.tandfonline.com/doi/full/10.1080/01621459.2016.1260466
A method to balance observed covariates between treatment groups in observational studies. It down-weights observations with extreme propensity scores and up-weights observations with central propensity scores, i.e. those in the region where the treatment groups overlap.
Each unit’s weight is proportional to the probability of that unit being assigned to the opposite group: w_i = 1 - Pr[A=a_i | X_i]
This method assumes only two treatment groups exist.
- Parameters:
learner – Initialized sklearn model.
use_stabilized (bool) – Whether to re-weight the learned weights with the prevalence of the treatment. See Also: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4351790/#S6title
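The weight formula above can be illustrated with a minimal, library-free sketch. It does not call causallib; the function name `overlap_weights` and its inputs are hypothetical, and the propensity scores are assumed to be already estimated (for example by the fitted learner) for a binary treatment.

```python
def overlap_weights(propensity_treated, treatment):
    """Overlap weight per unit: w_i = 1 - Pr[A = a_i | X_i].

    propensity_treated: Pr[A = 1 | X_i] for each unit (assumed given).
    treatment: observed binary treatment a_i (0 or 1) for each unit.
    """
    weights = []
    for e, a in zip(propensity_treated, treatment):
        # Probability of the treatment the unit actually received.
        prob_of_observed = e if a == 1 else 1.0 - e
        # The weight is the probability of the opposite group.
        weights.append(1.0 - prob_of_observed)
    return weights


# Three treated units with different propensities, then one control unit.
scores = [0.95, 0.5, 0.1, 0.1]
assigned = [1, 1, 1, 0]
weights = overlap_weights(scores, assigned)
```

A treated unit that was almost certain to be treated (score 0.95) receives a small weight, while a treated unit that resembles the control group (score 0.1) receives a large one.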
- __init__(learner, use_stabilized=False)[source]#
- compute_weight_matrix(X, a, clip_min=None, clip_max=None, use_stabilized=None)[source]#
Computes each individual’s weight for every possible treatment value: w_ij = 1 - Pr[A=a_j | X_i] for every individual i and treatment j.
- Parameters:
X (pandas.DataFrame) – Covariate matrix of size (num_subjects, num_features).
a (pandas.Series) – Treatment assignment of size (num_subjects,).
clip_min (None|float) – Lower bound for propensity scores. Best left as None.
clip_max (None|float) – Upper bound for propensity scores. Best left as None.
use_stabilized (None|bool) – Whether to re-weight the learned weights with the prevalence of the treatment. This overrides the use_stabilized parameter provided at initialization. If True is provided but the model was initialized with use_stabilized=False, the prevalence is calculated from the data at hand rather than from the training data. See Also: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4351790/#S6title
- Returns:
- A matrix of size (num_subjects, num_treatments) with a weight for every individual and every treatment.
- Return type:
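As a rough illustration of the matrix form w_ij = 1 - Pr[A=a_j | X_i], the sketch below works on plain nested lists rather than on causallib objects. The helper name `overlap_weight_matrix` is hypothetical, and applying the clipping bounds to the propensities before taking the complement is an assumption based on the parameter descriptions, not a statement about the library's internals.

```python
def overlap_weight_matrix(propensity_matrix, clip_min=None, clip_max=None):
    """w_ij = 1 - Pr[A = a_j | X_i] for every unit i and treatment value j.

    propensity_matrix: list of rows; row i holds Pr[A = a_j | X_i] for each j.
    clip_min / clip_max: optional bounds applied to the propensities first
        (an assumption of this sketch).
    """
    matrix = []
    for row in propensity_matrix:
        weights_row = []
        for p in row:
            if clip_min is not None:
                p = max(p, clip_min)
            if clip_max is not None:
                p = min(p, clip_max)
            weights_row.append(1.0 - p)
        matrix.append(weights_row)
    return matrix


# One row per unit, one column per treatment value (two treatments here).
propensities = [[0.8, 0.2], [0.5, 0.5], [0.02, 0.98]]
weight_matrix = overlap_weight_matrix(propensities)
clipped = overlap_weight_matrix(propensities, clip_min=0.05, clip_max=0.95)
```

With two treatment groups, each row of the result simply swaps the two propensities, which is why a unit's weight equals its probability of belonging to the opposite group.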
- stabilize_weights(a, weight_matrix, use_stabilized=False)[source]#
Adjust sample weights according to class prevalence: Pr[A=a_i] * w_i
- Parameters:
a (pandas.Series) – Treatment assignment of size (num_subjects,).
weight_matrix (pandas.DataFrame) – Weight matrix of size (num_subjects, num_treatments).
use_stabilized (None|bool) – Whether to re-weight the learned weights with the prevalence of the treatment. This overrides the use_stabilized parameter provided at initialization. If True is provided but the model was initialized with use_stabilized=False, the prevalence is calculated from the data at hand rather than from the training data. See Also: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4351790/#S6title
- Returns:
- A matrix of size (num_subjects, num_treatments) with the stabilized (if use_stabilized is True) weight for every individual and every treatment.
- Return type:
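The prevalence adjustment Pr[A=a_i] * w_i can be sketched without the library as follows. The helper `stabilize` is hypothetical; it estimates the prevalence from the treatment values it is given and assumes the weight-matrix columns are ordered by sorted treatment value.

```python
from collections import Counter


def stabilize(treatment, weight_matrix):
    """Scale column j of the weight matrix by the prevalence Pr[A = a_j].

    treatment: observed treatment values; prevalence is estimated from them.
    weight_matrix: list of rows with one weight per treatment value, columns
        ordered by sorted treatment value (an assumption of this sketch).
    """
    counts = Counter(treatment)
    values = sorted(counts)
    prevalence = [counts[v] / len(treatment) for v in values]
    return [[p * w for p, w in zip(prevalence, row)] for row in weight_matrix]


# Three control units and one treated unit: prevalence is 0.75 and 0.25.
a = [0, 0, 0, 1]
w = [[0.2, 0.8], [0.5, 0.5], [0.9, 0.1], [0.4, 0.6]]
stabilized = stabilize(a, w)
```

Because the prevalence here comes from the data passed in, the sketch corresponds to the "data at hand" case described for use_stabilized rather than to a prevalence stored at fit time.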
- set_fit_request(*, a='$UNCHANGED$')#
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.