causallib.survival.regression_curve_fitter module

class causallib.survival.regression_curve_fitter.RegressionCurveFitter(learner: sklearn.base.BaseEstimator)[source]

Bases: object

Default implementation of a parametric survival curve fitter with covariates (pooled regression). The API follows the ‘lifelines’ convention for regression models; see, for example: https://lifelines.readthedocs.io/en/latest/fitters/regression/CoxPHFitter.html#lifelines.fitters.coxph_fitter.CoxPHFitter.fit

Parameters

learner – scikit-learn estimator (must implement predict_proba), used to compute the parametric curve by fitting a time-varying hazards model that includes baseline covariates. Note that the model is fitted on a person-time table with all covariates, and might be computationally and memory expensive.
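To illustrate why the person-time table can be expensive, the following is a minimal sketch of a person-time expansion in plain Python. It is not causallib's internal implementation: the helper name expand_to_person_time, the output column names (id, t, event_at_t), and the assumption of integer durations counted from time step 1 are all illustrative.

```python
from typing import Dict, List


def expand_to_person_time(
    subjects: List[Dict], duration_col: str, event_col: str
) -> List[Dict]:
    """Expand one row per subject into one row per subject per time step.

    Hypothetical helper for illustration only; assumes integer durations
    and that time steps are numbered 1..duration.
    """
    rows = []
    for subject_id, subject in enumerate(subjects):
        duration = subject[duration_col]
        had_event = subject[event_col] == 1
        # Everything except duration/event is treated as a baseline covariate.
        covariates = {
            key: value
            for key, value in subject.items()
            if key not in (duration_col, event_col)
        }
        for t in range(1, duration + 1):
            rows.append(
                {
                    "id": subject_id,
                    "t": t,
                    **covariates,
                    # The outcome is flagged only on the subject's last
                    # time step, and only if that subject was not censored.
                    "event_at_t": int(had_event and t == duration),
                }
            )
    return rows


# Made-up input: one subject with an outcome at t=3, one censored at t=2.
subjects = [
    {"age": 50, "duration": 3, "event": 1},
    {"age": 62, "duration": 2, "event": 0},
]
person_time = expand_to_person_time(subjects, "duration", "event")
```

The expanded table has one row per subject per time step, so its row count equals the sum of all durations, with every baseline covariate repeated on each row. That growth is what drives the memory and compute cost for long follow-up periods.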

fit(df: pandas.core.frame.DataFrame, duration_col: str, event_col: Optional[str] = None, weights_col: Optional[str] = None)[source]

Fits a parametric curve with covariates.

Parameters
  • df (pd.DataFrame) – DataFrame, must contain a ‘duration_col’, and optional ‘event_col’ / ‘weights_col’. All other columns are treated as baseline covariates.

  • duration_col (str) – Name of column with subjects’ lifetimes (time-to-event)

  • event_col (Optional[str]) – Name of column with event type (outcome=1, censor=0). If unspecified, assumes that all events are ‘outcome’ (no censoring).

  • weights_col (Optional[str]) – Name of column with optional subject weights.

Returns

Self

predict_survival_function(X: Optional[Union[pandas.core.series.Series, pandas.core.frame.DataFrame]] = None, times: Optional[Union[List[float], numpy.ndarray, pandas.core.series.Series]] = None) pandas.core.frame.DataFrame[source]

Predicts survival function (table) for individuals, given their covariates.

Parameters
  • X (pd.DataFrame / pd.Series) – Subjects’ covariates.

  • times – An iterable of increasing time points at which to predict. If unspecified, predicts at all observed time points in the data.

Returns

Each column contains a survival curve for an individual, indexed by time-steps

Return type

pd.DataFrame
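In a discrete-time (pooled) hazards model, a survival curve is commonly obtained as the running product of one minus the per-step hazard. The sketch below shows that general relationship in plain Python and arranges the result as a time-indexed table with one column per individual. It is an illustration under assumptions, not causallib's actual code: the function name survival_from_hazards and the hazard values are made up.

```python
from typing import Dict, List


def survival_from_hazards(hazards: List[float]) -> List[float]:
    """Convert per-time-step hazards h_1..h_T into S(t) = prod_{k<=t} (1 - h_k)."""
    survival = []
    surviving = 1.0
    for hazard in hazards:
        surviving *= 1.0 - hazard
        survival.append(surviving)
    return survival


# Made-up predicted hazards for two individuals at time steps 1, 2, 3.
times = [1, 2, 3]
hazards_by_subject: Dict[str, List[float]] = {
    "subject_0": [0.1, 0.2, 0.5],
    "subject_1": [0.0, 0.5, 0.5],
}

curves = {
    name: survival_from_hazards(hazards)
    for name, hazards in hazards_by_subject.items()
}

# Time-indexed table: one entry per time step, one column per individual.
table = {
    t: {name: curves[name][i] for name in curves}
    for i, t in enumerate(times)
}
```

Each curve is non-increasing and bounded by 1, which is a quick sanity check to apply to any predicted survival table.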