Estimate log-normal survival

amoynahan · May 27, 2024, 3:28pm

I’m trying to set up a log-normal survival analysis. The log likelihood should be in the form:

llike=v*log(f(t))+(1-v)*S(t)

Where v is the event flag, f(t) the pdf, and S(t) the survival function.

When I look at the SAS documentation they show the formulas as follows:

Where w=log(t)

I don’t know how to make the cumulative normal distribution in the survival function. I am looking for suggestions on how to proceed. Scipy has a CDF for the normal distributions, but I don’t know now to get it into the pymc graph. I am also looking into making my own normal CDF using numerical recipes.

ricardoV94 · May 27, 2024, 4:40pm

https://www.pymc.io/projects/docs/en/latest/api/generated/pymc.logcdf.html

amoynahan · May 27, 2024, 4:58pm

If I use the formula I would need to use the exponential formula, create the survival distribution, and then take the log. That might work.

def normal_logS(value, mu, sigma):
    return pm.math.log(1-pm.math.exp(pm.logcdf(pm.Normal.dist(mu, sigma), value)))

amoynahan · May 27, 2024, 6:17pm

The code below worked. Thanks for the suggestion. On to gompertz.

(event is for event rather than censored, T is logged Time)

def normal_logS(value, mu, sigma):
    return pm.math.log(1- pm.math.exp(pm.logcdf(pm.Normal.dist(mu, sigma), value)))

from pymc.math import log,exp,sigmoid
with pm.Model() as model_1:
    beta0=pm.Normal("beta0",mu=0,sigma=100)
    beta1=pm.Normal("beta1",mu=0,sigma=100)
    alpha=pm.Gamma("alpha",0.001,0.001);
    
    u=beta0+beta1*trt
    sigma=pm.Deterministic("sigma",1/alpha)

    llike=event*(pm.logp(pm.Normal.dist(mu=u,sigma=sigma),T))+(1-event)*normal_logS(T,u,sigma)    

    y = pm.Potential("y",llike)
    data = pm.sample(draws=10000,tune=5000,nuts_sampler="nutpie",cores=20,se)

ricardoV94 · May 27, 2024, 6:42pm

Have you checked if you can use pm.Censored for your application?

https://www.pymc.io/projects/docs/en/latest/api/distributions/censored.html

amoynahan · May 28, 2024, 2:36pm

I have not tried it. For it to work the lower and upper founds would have to change based on the patient, or by rows in the dataset. If they can do this, I could construct the lower and upper based on my times and knowledge of if the patient had the event or not to make lower and upper bounds that would force the use of either the probability density or the survival distribution.

cluhmann · May 28, 2024, 2:38pm

You should be able to pass in arrays of upper and lower to pm.Censored.

amoynahan · May 28, 2024, 2:47pm

Thanks. I will try it. I will be a nice clean solution if it works.

I’m trying to replicate a model from a paper that estimated multiple survival endpoints, along with a cure fraction. My first step was to just be able to estimate a standard set of survival distributions. So far, I have: Exponential, Weibull, Log-Normal, Log-Logistic, and Gompertz. These were the distributions used in the paper. I will see if I can use the pm.Censored function. So far I’m using the formula: elog(f) + (1-e)log(S) (e is evented), This simplifies to: elog(h) + S. If pm.Censored does this then I should get the same answer and with cleaner code.

ricardoV94 · May 28, 2024, 3:15pm

To have mixed cases you have to set lower and upper per subject. For a vector of 3 observations, the first lower censored, the second uncensored, and the third right censored it would look something like:

observed = [2, 0.5, 3]
pm.Censored(..., lower=[2, -np.inf, -np.inf], upper=[np.inf, np.inf, 3], observed=observed)

The censoring is indicated by the observation matching one of the bounds.

Aaron_Moynahan · May 28, 2024, 4:21pm

Thank you. I will try it. I will be great if it works.

amoynahan · May 28, 2024, 5:03pm

I looks like it worked, and the code seems cleaner than working out the algebra for the log likelihood function:

Note: I still need to make sure that my linear predictor and priors are right…

Since I took the log of Time, I am doing my log normal case: The plots look pretty good.

### Data prep:

T=np.round(np.log(rfs.time.to_numpy()),6)
event=rfs.event.to_numpy(dtype=np.int32)

lower=np.empty(shape=T.shape,dtype=np.float64)
upper=np.empty(shape=T.shape,dtype=np.float64);

for i in range(T.shape[0]):
    lower[i]=-np.inf
    if event[i]==1:
        upper[i]=np.inf
    else:
        upper[i]=T[i]

### Model

with pm.Model() as model:
    beta0=pm.Normal("beta0",mu=0,sigma=100)
    beta1=pm.Normal("beta1",mu=0,sigma=100)
    alpha=pm.Gamma("alpha",0.001,0.001);
    
    u=beta0+beta1*trt
    sigma=pm.Deterministic("sigma",1/alpha)

    normal_dist = pm.Normal.dist(mu=u, sigma=sigma)
    censored_normal = pm.Censored("censored_normal", normal_dist, lower=lower, upper=upper,observed=T)
    
    data = pm.sample(draws=10000,tune=5000,nuts_sampler="nutpie",cores=20)

amoynahan · May 28, 2024, 5:29pm

Below is a link two the paper that I want to replicate the results:

A Bayesian hierarchical mixture cure modelling framework to utilize multiple survival datasets for long-term survivorship estimates: A case study from previously untreated metastatic melanoma (arxiv.org)

After being estimate survival curves I need to implement two new items. The first is a cure fraction, which models cases where after some point the curve flattens out, and the flat part is cured part. After that I need to be able to have two endpoints in single model (e.g. progression free survival and overall survival).

Topic		Replies	Views
Estimating Two survivals curves at the same time with a cure fraction	0	65	June 8, 2024
Simple Exponential Survival Function Questions	8	1117	October 21, 2024
How to formulate a Lognormal likelihood using PyMC3? Questions	19	2350	March 10, 2021
How to use pymc3 for survival analysis? Questions	6	2025	August 5, 2021
Censoring a truncated distribution v5	7	391	November 22, 2023

Estimate log-normal survival

Related topics