Simple Exponential Survival Function

I’m having trouble replicating survival analysis in PyMC3. I’ve been following along with the Statistical Rethinking lectures, and there’s an example about survival analysis on cat adoptions that isn’t in the published book. I’m wondering how to convert it into a PyMC3 model, particularly this sort of logic:

Screen Shot 2020-04-28 at 11.46.29 AM

If there is an instance of adoption it isn’t censored data so I can use a regular exponential distribution, and if it isn’t labeled as an adoption it is censored data because the cat might still end up being adopted so I’d use the exponential-CCDF. How do I implement this sort of logic in PyMC3?

Here’s stan code for the model as well:

model{
vector[22356] lambda; 
b ~ normal( 0 , 1 ); 
a ~ normal( 0 , 1 ); 
for ( i in 1:22356 ) {
        lambda[i] = a + b * black[i];
        lambda[i] = exp(lambda[i]);
    }

for ( i in 1:22356 )
    if ( adopted[i] == 0 ) target += exponential_lccdf(days_to_event[i] | lambda[i]);
for ( i in 1:22356 )
    if ( adopted[i] == 1 ) days_to_event[i] ~ exponential( lambda[i] );
}

You can use a potential for that. Note that we have logcdf implemented for pm.Exponential but not the survival function (i.e., lccdf), so the best way I can think of is:

with pm.Model():
    a = pm.Normal('a', 0, 1)
    b = pm.Normal('b', 0, 1)
    log_rate = a + b * black
    y = pm.Exponential('observed', tt.exp(log_rate), observed=days_to_event)
    pm.Potential('survival', -log_rate[adopted == 0])  ==> exp_lccdf = exp_log_prob(lambda, rate) - log_rate

I couldn’t end up following your solution as well as I had hoped, part of it is being new to PyMC3 and part of it is needing to work on my theoretical knowledge more. I did however find this exponential survival function from PyMC3s website (which I just managed to derive myself after and figure out how they got it) Screen Shot 2020-04-29 at 9.37.46 AM
and a corresponding function

def exponential_logp(censored, value):
    return (censored * tt.log(lam) - value*lam).sum()

So the model ends up being

with pm.Model() as m1:
    a_c = pm.Normal('a_color', 0, 1, shape=len(df.color_id.unique()))
    mu = tt.exp(a_c[cid])
    lam = 1/mu
    exp_surv = pm.DensityDist('exp_surv', exponential_logp, observed={'censored':censored, 'value':days_to_event})
    
    m1_trace = pm.sample(draws=1000)

And this model is returning the exact same inference as the Stan model. However, when I try to to run
pm.sample_posterior_predictive(trace_m1)
I get the following error:
ValueError: Distribution was not passed any random method Define a custom random method and pass it as kwarg random

Are there any good resources on how to fix this issue and create a custom distribution from my exponential_logp function? I attempted to troubleshoot the issue but I haven’t been getting very far

edit: with a little more effort I ended up following along with your code, which unlike with my approach, allows for sampling. Thanks!

@john_c I have been trying to solve the same problem. Did you end up using Potential? If so, How did you incorporated your model? Seems like you are missing priors for exponential_logp !! Thanks