causallib.estimation.PropensityFeatureStandardization#
- class PropensityFeatureStandardization(outcome_model, weight_model, outcome_covariates=None, weight_covariates=None, feature_type='weight_vector')[source]#
A doubly-robust estimator of the effect of treatment. This model adds the weighting (inverse probability weighting) as an additional feature to the outcome model.
References
Bang and Robins, https://doi.org/10.1111/j.1541-0420.2005.00377.x
Kang and Schafer, section 3.3, https://dx.doi.org/10.1214/07-STS227
- Parameters:
outcome_model (IndividualOutcomeEstimator) – A causal model that estimates outcomes at the individual level.
weight_model (WeightEstimator | PropensityEstimator) – A causal model for weighting individuals (e.g. IPW).
outcome_covariates (numpy.ndarray) – Covariates to use for the outcome model. If None, all covariates passed will be used. Either a list of column names or a boolean mask.
weight_covariates (numpy.ndarray) – Covariates to use for the weight model. If None, all covariates passed will be used. Either a list of column names or a boolean mask.
feature_type (str) – The type of covariate to add. One of the following options:
- "weight_vector": uses a weight vector. Only defined for binary treatment. For example, if weight_model is IPW then: 1/Pr[A=a_i|X] for each sample i. As described in Bang and Robins (2005).
- "signed_weight_vector": as "weight_vector", but negates the weights of the control group. For example, if weight_model is IPW then: 1/Pr[A|X] for treated and -1/Pr[A|X] for controls. As described in the correction for Bang and Robins (2008).
- "weight_matrix": uses the entire weight matrix. For example, if weight_model is IPW then: 1/Pr[A_i=a|X_i=x], for all treatment values a and for every sample i.
- "masked_weight_matrix": uses the entire weight matrix, but masks it with a dummy-encoding of the treatment assignment. For example, if weight_model is IPW then: 1/Pr[A=a_i|X=x_i] and 0 for all other a≠a_i columns. As described in Bang and Robins (2005).
- "propensity_vector": uses the probabilities for being in the treatment group: Pr[A=1|X]. Better defined for binary treatment. Equivalent to Scharfstein, Rotnitzky, and Robins (1999), who use its inverse.
- "logit_propensity_vector": uses the logit transformation of the propensity to treat, Pr[A=1|X]. As described in Kang and Schafer (2007).
- "propensity_matrix": uses the probabilities for all treatment options, Pr[A_i=a|X_i=x] for all treatment values a and samples i.
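The feature_type options can be illustrated with a small sketch. This is not causallib's implementation: the helper name propensity_features is hypothetical, the propensity values are made up, and treatment values are assumed to be the integers 0 (control) and 1 (treated) so they can double as column indices. It only shows which numbers each option would append to the outcome model's covariates when the weights are inverse propensities.

```python
import math


def propensity_features(propensities, assignments, feature_type):
    """Build the extra outcome-model feature(s) for each sample (illustrative only).

    propensities: list of rows, propensities[i][t] = Pr[A=t | X=x_i]
    assignments:  list of observed treatment values a_i (assumed 0 or 1)
    """
    features = []
    for probs, a_i in zip(propensities, assignments):
        # IPW-style weights: one inverse propensity per treatment value
        weights = [1.0 / p for p in probs]
        if feature_type == "weight_vector":
            row = [weights[a_i]]
        elif feature_type == "signed_weight_vector":
            # Negate the weight of control samples (binary treatment assumed)
            sign = 1.0 if a_i == 1 else -1.0
            row = [sign * weights[a_i]]
        elif feature_type == "weight_matrix":
            row = weights
        elif feature_type == "masked_weight_matrix":
            # Keep only the weight of the observed treatment, zero elsewhere
            row = [w if t == a_i else 0.0 for t, w in enumerate(weights)]
        elif feature_type == "propensity_vector":
            row = [probs[1]]
        elif feature_type == "logit_propensity_vector":
            row = [math.log(probs[1] / (1.0 - probs[1]))]
        elif feature_type == "propensity_matrix":
            row = list(probs)
        else:
            raise ValueError(f"unknown feature_type: {feature_type}")
        features.append(row)
    return features


# Two samples: the first is treated with Pr[A=1|X]=0.2, the second is a control
propensities = [[0.8, 0.2], [0.5, 0.5]]
assignments = [1, 0]
signed = propensity_features(propensities, assignments, "signed_weight_vector")
masked = propensity_features(propensities, assignments, "masked_weight_matrix")
```

The vector options add a single column to the outcome model, while the matrix options add one column per treatment value.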
- estimate_individual_outcome(X, a, treatment_values=None, predict_proba=None)[source]#
Estimates individual outcome under different treatment values (interventions)
- Parameters:
X (pandas.DataFrame) – Covariate matrix of size (num_subjects, num_features).
a (pandas.Series) – Treatment assignment of size (num_subjects,).
treatment_values (Any) – Desired treatment value/s to use when estimating the counterfactual outcome. If not supplied, calculates for all available treatment values.
predict_proba (bool | None) – Applies when the outcome task is classification and the learner supports the operation. If True, prediction uses the learner’s predict_proba or decision_function, which returns a continuous matrix of size (n_samples, n_classes). If False, predict is used and the return value is based on a vector of class classifications. If None, the parameter is ignored and behaviour is as specified when initializing the IndividualOutcomeEstimator.
- Returns:
A DataFrame whose columns are treatment values and whose rows are individuals: each column is a vector of size (num_samples,) that contains the estimated outcome for each individual under the treatment value in the corresponding key.
- Return type:
pandas.DataFrame
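As a rough illustration of how the returned table can be used, the sketch below averages the difference between two treatment columns to get an average effect estimate. A plain dict of lists stands in for the DataFrame (keys are treatment values, lists hold per-individual predictions). The helper name average_effect and the numbers are hypothetical and not part of causallib.

```python
def average_effect(individual_outcomes, treated=1, control=0):
    """Mean per-individual difference between two treatment columns."""
    y_treated = individual_outcomes[treated]
    y_control = individual_outcomes[control]
    diffs = [t - c for t, c in zip(y_treated, y_control)]
    return sum(diffs) / len(diffs)


# Hypothetical predictions for three individuals under treatment values 0 and 1
individual_outcomes = {
    0: [1.0, 2.0, 3.0],
    1: [2.0, 2.5, 5.0],
}
effect = average_effect(individual_outcomes)
```

With an actual DataFrame the same quantity would be the mean of the difference between the two columns.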
- fit(X, a, y, refit_weight_model=True, **kwargs)[source]#
Trains a causal model from observed data.
- Parameters:
X (pandas.DataFrame) – Covariate matrix of size (num_subjects, num_features).
a (pandas.Series) – Treatment assignment of size (num_subjects,).
y (pandas.Series) – Observed outcome of size (num_subjects,).
sample_weight – To be passed to the underlying scikit-learn’s fit method.
- Returns:
A causal weight model with an inner learner fitted.
- Return type:
IndividualOutcomeEstimator
- set_fit_request(*, a='$UNCHANGED$', refit_weight_model='$UNCHANGED$')#
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
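- Parameters:
a (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the a parameter in fit.
refit_weight_model (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the refit_weight_model parameter in fit.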
- Returns:
self – The updated object.
- Return type:
object