Censored Weibull

PyMC censored requires access to the censored data (e.g. a list of values) and two absolute boundaries (upper and lower)

If you want to achieved this, it is easy with the Censored class. Lets say your data is

[10,10,20,180,180,1200]

If you supply

lower = [10, 10, 20, -np.inf, -np.inf, -np.inf] 
upper = [np.inf, np.inf, np.inf, np.inf, 180, 1200]

it will apply lower (left) censoring to 10,10,20 and upper (right) censoring to 180 and 1200.

Regarding the code you posted

pweibull(data$IncP_max,current$k,current$theta)

I assume this is the cumulative distribution function. If so, perhaps you do not really want censoring. Do you just have some measurements that you know falls within an interval (but do not know the exact measurement)? In that case difference between CDFs is indeed the right way to go and I would also do it via Potential terms. I have looked at the data provided by the paper and there are some instances where the lower is equal to upper so for those cases they might have used pdf for weibul (that is what I would have done atleast). So for instance something like this:

import pandas as pd
import numpy as np
import pymc as pm
import arviz as az
import pytensor.tensor as pt
from scipy.special import gammainc as gammai

np.random.seed(27)

def weibull_cdf(x, kappa, theta):
    return 1 - pt.exp(-(x/theta)**kappa)

def censored(name, kappa, theta, lower, upper):
    L = weibull_cdf(lower, kappa, theta)
    U = weibull_cdf(upper, kappa, theta)
    return U - L

data = pd.read_csv("./data_h7n9_severity.csv")

lower = data.IncP_min.values #incubation periods lower boundary
upper = data.IncP_max.values #incubation periods upper boundary

fatal = data.death_status.values

interval = np.array([[x,y,f] for x,y,f in zip(lower,upper,fatal) if x!=y])
exact = np.array([[x,f] for x,y,f in zip(lower, upper, fatal) if x==y])

with pm.Model() as mod:
    # kappa = pm.HalfNormal('kappa', 5)
    # theta = pm.HalfNormal('theta', 5)
    kappa = pm.Uniform('kappa', 0, 5, shape=2) #as used in the paper
    theta = pm.Uniform('theta', 0, 5, shape=2) #as used in the paper
    mu = pm.Deterministic('mu', theta*pt.gamma(1 + 1 / kappa)) #incubation period mean
    y = pm.Potential('y', censored('censored', kappa[interval[:,2]],
                                   theta[interval[:,2]], interval[:,0],
                                   interval[:,1]))
    pm.Weibull("exact", kappa[exact[:,1]], theta[exact[:,1]],
               observed=exact[:,0])


with mod:
    idata = pm.sample(1000, tune=1000)#, nuts_sampler='numpyro')#, random_seed=27)

az.summary(idata)

az.plot_posterior(idata)

from which I get

1 Like