Estimate log-normal survival

I’m trying to set up a log-normal survival analysis. The log likelihood should be in the form:

llike=v*log(f(t))+(1-v)*S(t)

Where v is the event flag, f(t) the pdf, and S(t) the survival function.

When I look at the SAS documentation they show the formulas as follows:

image

Where w=log(t)

I don’t know how to make the cumulative normal distribution in the survival function. I am looking for suggestions on how to proceed. Scipy has a CDF for the normal distributions, but I don’t know now to get it into the pymc graph. I am also looking into making my own normal CDF using numerical recipes.

https://www.pymc.io/projects/docs/en/latest/api/generated/pymc.logcdf.html

1 Like

If I use the formula I would need to use the exponential formula, create the survival distribution, and then take the log. That might work.

def normal_logS(value, mu, sigma):
    return pm.math.log(1-pm.math.exp(pm.logcdf(pm.Normal.dist(mu, sigma), value)))

The code below worked. Thanks for the suggestion. On to gompertz.

(event is for event rather than censored, T is logged Time)

def normal_logS(value, mu, sigma):
    return pm.math.log(1- pm.math.exp(pm.logcdf(pm.Normal.dist(mu, sigma), value)))

from pymc.math import log,exp,sigmoid
with pm.Model() as model_1:
    beta0=pm.Normal("beta0",mu=0,sigma=100)
    beta1=pm.Normal("beta1",mu=0,sigma=100)
    alpha=pm.Gamma("alpha",0.001,0.001);
    
    u=beta0+beta1*trt
    sigma=pm.Deterministic("sigma",1/alpha)

    llike=event*(pm.logp(pm.Normal.dist(mu=u,sigma=sigma),T))+(1-event)*normal_logS(T,u,sigma)    

    y = pm.Potential("y",llike)
    data = pm.sample(draws=10000,tune=5000,nuts_sampler="nutpie",cores=20,se)

Have you checked if you can use pm.Censored for your application?

https://www.pymc.io/projects/docs/en/latest/api/distributions/censored.html

I have not tried it. For it to work the lower and upper founds would have to change based on the patient, or by rows in the dataset. If they can do this, I could construct the lower and upper based on my times and knowledge of if the patient had the event or not to make lower and upper bounds that would force the use of either the probability density or the survival distribution.

You should be able to pass in arrays of upper and lower to pm.Censored.

Thanks. I will try it. I will be a nice clean solution if it works.

I’m trying to replicate a model from a paper that estimated multiple survival endpoints, along with a cure fraction. My first step was to just be able to estimate a standard set of survival distributions. So far, I have: Exponential, Weibull, Log-Normal, Log-Logistic, and Gompertz. These were the distributions used in the paper. I will see if I can use the pm.Censored function. So far I’m using the formula: elog(f) + (1-e)log(S) (e is evented), This simplifies to: elog(h) + S. If pm.Censored does this then I should get the same answer and with cleaner code.

To have mixed cases you have to set lower and upper per subject. For a vector of 3 observations, the first lower censored, the second uncensored, and the third right censored it would look something like:

observed = [2, 0.5, 3]
pm.Censored(..., lower=[2, -np.inf, -np.inf], upper=[np.inf, np.inf, 3], observed=observed)

The censoring is indicated by the observation matching one of the bounds.

Thank you. I will try it. I will be great if it works.

I looks like it worked, and the code seems cleaner than working out the algebra for the log likelihood function:

Note: I still need to make sure that my linear predictor and priors are right…

Since I took the log of Time, I am doing my log normal case: The plots look pretty good.

### Data prep:

T=np.round(np.log(rfs.time.to_numpy()),6)
event=rfs.event.to_numpy(dtype=np.int32)

lower=np.empty(shape=T.shape,dtype=np.float64)
upper=np.empty(shape=T.shape,dtype=np.float64);

for i in range(T.shape[0]):
    lower[i]=-np.inf
    if event[i]==1:
        upper[i]=np.inf
    else:
        upper[i]=T[i]

### Model

with pm.Model() as model:
    beta0=pm.Normal("beta0",mu=0,sigma=100)
    beta1=pm.Normal("beta1",mu=0,sigma=100)
    alpha=pm.Gamma("alpha",0.001,0.001);
    
    u=beta0+beta1*trt
    sigma=pm.Deterministic("sigma",1/alpha)

    normal_dist = pm.Normal.dist(mu=u, sigma=sigma)
    censored_normal = pm.Censored("censored_normal", normal_dist, lower=lower, upper=upper,observed=T)
    
    data = pm.sample(draws=10000,tune=5000,nuts_sampler="nutpie",cores=20)
1 Like

Below is a link two the paper that I want to replicate the results:

A Bayesian hierarchical mixture cure modelling framework to utilize multiple survival datasets for long-term survivorship estimates: A case study from previously untreated metastatic melanoma (arxiv.org)

After being estimate survival curves I need to implement two new items. The first is a cure fraction, which models cases where after some point the curve flattens out, and the flat part is cured part. After that I need to be able to have two endpoints in single model (e.g. progression free survival and overall survival).

2 Likes