Bayesian Survival Function

This is a crosspost: https://stats.stackexchange.com/questions/403714/bayesian-survival-function

I have managed to estimate the posterior of the latent variables of my model, which can be stated as follows (adapted from https://docs.pymc.io/notebooks/bayes_param_survival_pymc3.html):

y_i = \beta^T x_i + \xi_i

where y_i \equiv \log T_i is the log duration, and

\xi_i \sim Gumbel(0, s) \\ s \sim HalfNormal(5)

Note that this can be equivalently stated as y_i \sim Gumbel(\beta^T x_i, s).

Therefore, the likelihood of the dataset for both censored and uncensored data is:

\log p\left(\{y_i\}_{i=1}^N | \beta, s\right) = \sum_{i \in \text{uncensored}} \log p\left(y_i = \log T_i | \beta, s\right) + \sum_{i \in \text{censored}} \log p\left(y_i > \log T_i | \beta, s\right)
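
For concreteness, here is a minimal PyMC3 sketch of that likelihood, assuming a design matrix X, log durations y, and a boolean censoring mask cens (these names are mine, not necessarily the notebook's); the censored rows contribute their log survival probability through a pm.Potential:

import pymc3 as pm
import theano.tensor as tt

def gumbel_log_sf(y, mu, s):
    # log P(Y > y) for Y ~ Gumbel(mu, s): log(1 - exp(-exp(-(y - mu)/s)))
    return tt.log1p(-tt.exp(-tt.exp(-(y - mu) / s)))

with pm.Model() as weibull_model:
    beta = pm.Normal('beta', 0.0, 5.0, shape=X.shape[1])
    s = pm.HalfNormal('s', 5.0)

    # Uncensored rows: full Gumbel log-density of y_i = log T_i
    y_obs = pm.Gumbel('y_obs', tt.dot(X[~cens], beta), s, observed=y[~cens])
    # Censored rows: add log P(y_i > log T_i) to the model log-probability
    y_cens = pm.Potential('y_cens', gumbel_log_sf(y[cens], tt.dot(X[cens], beta), s))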

The question is about getting a Bayesian estimate of the survival function. The author of the above notebook samples from the posterior predictive distribution of y using:

# Point the shared data at a single test case and mark it as uncensored;
# the dummy Y value is never used, only its shape matters.
X_.set_value(np.array(x_test)[None, :])
cens_.set_value(np.array([False]))
Y_.set_value(np.zeros(1))

with weibull_model:
    # draw posterior predictive samples of y for the test point
    post_trace = pm.sample_posterior_predictive(trace, vars=[y_obs])

and uses the samples of y (converted back to the time domain) to generate the survival function with the following logic (for time steps 1 to 200):

# fraction of pooled posterior predictive samples of T that exceed t
for t in range(1, 201):
    survival_func[t] = np.mean(t_samples > t)

It makes sense that the fraction of samples exceeding a particular time estimates the survival probability at that time. However, this gives only a point estimate of the survival function.

There are two ways I can think of to get a ‘Bayesian’ estimate of the survival function:

  1. Sample N traces (each with, say, 1000 points) from the posterior predictive distribution and do:
for n in range(N):
    for t in range(1, 201):
        # one empirical survival curve per posterior predictive trace
        survival_func[n, t] = np.mean(t_samples[n] > t)
  2. We know the CDF of the Gumbel distribution given the latent variables to be \exp\left(-\exp\left(-\left(\log T_i - X_i^T\beta\right)/s\right)\right). Hence we can let the j-th sample of the survival function at a given time T be 1 - \exp\left(-\exp\left(-\left(\log T - X^T\beta^{(j)}\right)/s^{(j)}\right)\right), where \beta^{(j)}, s^{(j)} are samples from the posterior of the latent variables (a sketch follows below).
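
Here is the sketch of method 2 referred to above, vectorised over posterior draws; beta_samples and s_samples are hypothetical names for the flattened posterior draws (e.g. trace['beta'] and trace['s']), x_test is a single covariate vector, and times is the grid 1 to 200:

import numpy as np

def survival_curves(beta_samples, s_samples, x_test, times):
    # One survival curve per posterior draw (method 2):
    # S(t) = 1 - exp(-exp(-(log t - x^T beta) / s))
    mu = beta_samples @ x_test                                   # (J,)
    z = (np.log(times)[None, :] - mu[:, None]) / s_samples[:, None]
    return 1.0 - np.exp(-np.exp(-z))                             # (J, T)

curves = survival_curves(beta_samples, s_samples, x_test, np.arange(1, 201))
lo, med, hi = np.percentile(curves, [2.5, 50.0, 97.5], axis=0)   # pointwise band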

Are the two methods equivalent? If not, which is the more correct way of getting a Bayesian sample of the survival function?

Edit

I feel like I should point out, after @Gon_F’s answer, that in method 2 what I am calculating is the expectation of the survival function 1 - \exp\left(-\exp\left(-\left(\log T - X^T\beta\right)/s\right)\right) under the distribution p(\beta, s | \mathcal{D}_{train}), where \mathcal{D}_{train} is the training dataset. The first method, by contrast, looks directly at the distribution p(T | \mathcal{D}_{train}), in which \beta, s have already been marginalised out. So I’m wondering if (1) is an approximation to (2) after all. Hopefully I’m not complicating things more.
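
One way to see the relationship numerically: averaging the method-2 curves over the posterior draws should agree, up to Monte Carlo error, with the method-1 fractions computed from the pooled posterior predictive samples. A sketch, reusing the hypothetical survival_curves above and the posterior predictive samples t_samples:

import numpy as np

times = np.arange(1, 201)

# Method 2, then average over the posterior draws of (beta, s)
mean_curve = survival_curves(beta_samples, s_samples, x_test, times).mean(axis=0)

# Method 1 on the pooled posterior predictive samples t_samples, shape (K,)
empirical_curve = (t_samples[:, None] > times[None, :]).mean(axis=0)

print(np.abs(mean_curve - empirical_curve).max())  # small, up to MC error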


I love this question.

The way it is written, (1) is a Monte Carlo approximation to (2). In order to sample from the density in (2), you would:

(i) Sample \beta^{(j)}, s^{(j)} from P_\mathrm{post}(\beta, s)

(ii) Sample y_i^{(j)} from \mathrm{Gum}(x_i^T\beta^{(j)}, s^{(j)})

(iii) Take T_i^{(j)} = \exp(y_i^{(j)})

This is precisely the content of the trace for y_i.
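
In code, the three steps might look like the following sketch (beta_samples, s_samples, and x_test are hypothetical names for the posterior draws and the test covariates; NumPy's gumbel uses the same max convention as pm.Gumbel):

import numpy as np

rng = np.random.default_rng(0)

# (i) posterior draws give the location mu^{(j)} = x^T beta^{(j)}
mu = beta_samples @ x_test
# (ii) y^{(j)} ~ Gumbel(mu^{(j)}, s^{(j)})
y = rng.gumbel(loc=mu, scale=s_samples)
# (iii) back to the time domain
T = np.exp(y)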


If we do find an effective way to compute survival functions, it would open a path to implementing joint models, like Stan has, which allow modelling time-to-event (TTE) outcomes that evolve over time; I had been looking at this recently.

Maybe @AustinRochford can comment on this.

To be clear, set \mathcal{D}_{train} = (X, y), and note that y only influences the posterior predictive p(T|X, y) through its influence on the posterior p(\beta, s | X, y):

p(T|X, y) = \int_{\beta, s} p(T|\beta, s, X)p(\beta, s|X, y)d\beta ds

or:

p(T|D) = \int_{\beta, s} p(T|\beta, s, D)\, p(\beta, s|D)\, d\beta\, ds

So sampling from p(\beta, s|D) and then from p(T|\beta, s), and computing \frac{1}{K}\sum_{i=1}^K T_i or \frac{\sum_{i=1}^K T_i\, p(T_i|\beta_i, s_i, X)\, p(\beta_i, s_i | D)}{\sum_i p(\beta_i, s_i|D)}, is a Monte Carlo approximation to the (numerical) expectation \int T\, p(T|D)\, dT.
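
For example, with the posterior predictive draws T from the sketch earlier in the thread, the plain Monte Carlo version is just:

# Monte Carlo estimate of the expectation: (1/K) * sum_i T_i
T_mean = T.mean()
# the same draws also give the survival function, e.g. S(t) at t = 100
S_at_t = (T > 100.0).mean()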
