I’ve been playing around with PyMC, and I wanted to ask a beginner’s question about an example on your website: Bayesian Survival Analysis — PyMC example gallery. If anyone is already familiar with that example, maybe they can quickly help me with my question.
In the notebook, a variable called death codes for an observed death in a given time period: it is 0 for no death and 1 for a death. The deaths are then modelled with a pm.Poisson("obs", mu, observed=death) random variable. Since death is 0 or 1, it seems more logical to me to model it with a pm.Bernoulli. If one assumes the same exponential rate mu, then the probability of death in the given exposure window would be given by p = 1 - e^{-mu}.
However, if I replace the line obs = pm.Poisson("obs", mu, observed=death) with obs = pm.Bernoulli("obs", p=1 - T.exp(-mu), observed=death), the sampler produces nothing but divergences. Does anyone know why, or where I can find the information I need to solve this problem? Alternatively, if my approach cannot work, can someone explain why a Poisson distribution should be used?
Good questions. There is both a Poisson and a binomial approximation to the Cox model. Because deaths are rare events, it turns out not to matter which one you use here.
The reason the binomial is not working for you is that you have not constrained the probability to lie on the unit interval. You want a logit transformation, rather than a log.
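A minimal numeric sketch of that point (my own illustration, not from the notebook): if a sampler proposal lets mu stray slightly negative, the direct transform 1 - exp(-mu) produces an invalid probability, while a sigmoid (inverse-logit) link can never leave the unit interval.

```python
import numpy as np

mu = -0.1  # a proposal where mu strays below zero

p_direct = 1 - np.exp(-mu)         # negative: not a valid probability
p_sigmoid = 1 / (1 + np.exp(-mu))  # always strictly inside (0, 1)

print(p_direct, p_sigmoid)
```

This is why a logit-parameterized Bernoulli stays well-behaved for any real-valued input, whereas the log-link version silently hands the likelihood an impossible probability whenever mu dips below zero.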
Thanks for the answer; I appreciate it. Am I correct to say that 1 - T.exp(-mu) is always going to be between 0 and 1 in the example? To me, it looks like mu has to be nonnegative. Is there somewhere I can read more about how to do this, or about why I have to use a logit transformation rather than a log? I like the Poisson approximation, but I cannot easily convince myself that death after mastectomy in patients with metastasized breast cancer is a rare event. Again, thanks for your answer!
You are probably right about the transformation in this case, so I would have to take a peek and see why it’s failing.
As far as rare events go, it’s rare in the sense that each event happens either once or never, so the Poisson mean would be small. While a binomial (Bernoulli) may seem more natural, the Cox survival model is directly related to Poisson regression through a clever restructuring of the survival data. So if you want to use a Bernoulli, it’s not just a matter of swapping it in for the Poisson; it requires a different structuring of the data. It is a proportional-odds (rather than proportional-hazards) model and is better suited to discrete time than continuous time.
Something to think about! It’s more complicated than I thought, then. I’m still curious to find out why it doesn’t work, since I don’t really get that yet. Thank you for your feedback.
Because sampling doesn’t require normalization, if you have a Poisson truncated to just 0 and 1 observations, it’s going to be fine for analysis (not for generation, so it’s suspect). This works out as follows: the truncated Poisson puts unnormalized mass e^{-mu} on 0 and mu * e^{-mu} on 1, which after normalization is a Bernoulli with P(1) = mu / (1 + mu).
Oh! That is simple and quite clever. So a pm.Poisson and a pm.Bernoulli are equivalent (up to an approximation) if all observations are 0 or 1. Thanks for that insight. Do you maybe also know why it didn’t work using pm.Bernoulli?
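A quick numeric check of this equivalence with scipy (my own sketch, not from the thread): for a small rate mu, the Poisson and the Bernoulli with p = 1 - exp(-mu) assign the same log-probability to a 0, and nearly the same log-probability to a 1.

```python
import numpy as np
from scipy.stats import bernoulli, poisson

mu = 0.01  # small rate, i.e. a rare event

# Poisson log-pmf restricted to the observed support {0, 1}
lp0 = poisson.logpmf(0, mu)   # equals -mu
lp1 = poisson.logpmf(1, mu)   # equals log(mu) - mu

# Bernoulli with the matching exponential-rate probability
p = 1 - np.exp(-mu)
lb0 = bernoulli.logpmf(0, p)  # log(1 - p) = -mu, same as the Poisson
lb1 = bernoulli.logpmf(1, p)  # ~ log(mu) - mu/2 for small mu

print(lp0 - lb0)  # ~ 0: agreement on the zeros
print(lp1 - lb1)  # ~ -mu/2: vanishes as mu -> 0
```

So the two likelihoods coincide on the non-events and differ only by a term of order mu/2 on the events, which is why the rare-event regime makes the choice immaterial.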
BUGS and JAGS would use this to sneak in arbitrary log densities. If you write 0 ~ poisson(-lambda) then it adds lambda to the log density. Here’s a section of a PyMC tutorial exploring that idea using a survival model as an example.
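To make that trick concrete (my own sketch, not from the tutorial linked above): observing a 0 under Poisson(rate) contributes exactly -rate to the log density, which is why a “negative rate” of -lambda smuggles +lambda in. In PyMC the sanctioned way to add an arbitrary term to the log density is pm.Potential.

```python
import numpy as np
from scipy.stats import poisson

lam = 2.3

# Poisson log-pmf at 0 is just -rate: the rate**k / k! factor
# vanishes at k = 0, leaving only exp(-rate).
contrib = poisson.logpmf(0, lam)
print(contrib)  # -2.3

# So "0 ~ poisson(-lambda)" in BUGS/JAGS adds +lambda to the log density.
# In PyMC, the idiomatic equivalent of such an arbitrary additive term is
#     pm.Potential("extra_loglik", lam)
```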
You have to be careful here: in general they are not.
The Poisson approximation to the Cox PH model works because of the definition of the partial likelihood function. We can basically ‘add’ a bunch of Poisson terms that correspond to no event (in the augmented dataset, where we create a bunch of pseudo-observations for each individual and interval within our observations), and the Poisson likelihood (whose means are a function of the time and hazard ratio within each individual–interval pair) becomes a very good discrete approximation to the Cox PH model: Statistics and Population (note: this course does not take a Bayesian approach, but the logic follows in the derivation of the PL function).
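A small sketch of the data augmentation described above (a piecewise-exponential setup with made-up data; the interval cuts and variable names are my own, not from the course): each subject is expanded into one pseudo-observation per interval in which they are at risk, carrying an exposure time and a 0/1 event indicator that a Poisson likelihood can then consume.

```python
import numpy as np

# Hypothetical follow-up data: time to event/censoring, event indicator
times = np.array([2.5, 4.0, 1.2])
events = np.array([1, 0, 1])
cuts = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])  # interval boundaries

rows = []  # (interval start, exposure, died) pseudo-observations
for t, e in zip(times, events):
    for lo, hi in zip(cuts[:-1], cuts[1:]):
        if t <= lo:
            break  # subject is no longer at risk in this interval
        exposure = min(t, hi) - lo
        died = int(bool(e) and t <= hi)
        rows.append((lo, exposure, died))

# Each pseudo-observation then gets a likelihood term like
#     obs ~ Poisson(hazard[interval] * exposure * hazard_ratio),
# where hazard_ratio = exp(beta @ x) for the subject's covariates x.
```

Note that the total exposure across the pseudo-observations equals the total observed follow-up time, and the event indicators sum to the number of deaths, so no information is lost in the restructuring.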
For the Bernoulli approximation, I’m not sure how it’s usually done outside of some non-parametric approaches that have a proportional-odds term. For those approaches that do not, generally the time of observation is regressed on the covariates, and this latent variable is then passed to a probit regression or similar.
Thanks for the link; I appreciate the help and I’m going through the material. If someone knows why the divergences appear, that would still be very informative to me.