Hello all,
I am attempting to re-create JAG models from an existing R project into PyMC. However, I am hitting a snag with censored data imputation. (Paper here if bored)
I need to be able to impute primarily left, and occasionally interval and right censored data within a small dataset that may have different censoring points per measurement. For example, results may come from different laboratories with different Limits of Quantification (LOQ).
The model is relatively simple. For example
# Example of data with left, right and interval censoring
data = ['28.9', '<19.4', '<5.5', '>149.9', '26.42', '56.1', '[25.6-50.2]']
I can follow the example for censored data here, however only for the non-imputated section. I found that you can pass an array for lower/upper for pm.Censored, but I can figure out how to get it working for pm.distributions.transforms.Interval (or whatever else I’ll need to do).
Example of Data Pre-Processing
# Censored Data Processing
@dataclass
class CensoredData:
y: List[float]
lower_limits: List[float]
upper_limits: List[float]
def define_observed_lower_upper(analysis: List[str]) -> CensoredData:
y = []
lower_limits = []
upper_limits = []
for value in expo_data:
if value.startswith('<'):
# Left-censored
limit = float(value[1:]) # Extract the numeric value
y.append(limit)
lower_limits.append(limit)
upper_limits.append(np.inf)
elif value.startswith('>'):
# Right-censored
limit = float(value[1:]) # Extract the numeric value
y.append(limit)
lower_limits.append(-np.inf)
upper_limits.append(limit)
elif '-' in value:
# Interval-censored
#TODO: Test once figured out imputation
lower, upper = map(float, value.split('-')) # Extract the range
y.append((lower + upper) / 2)
lower_limits.append(lower)
upper_limits.append(upper)
else:
# Uncensored
y.append(float(value))
lower_limits.append(-np.inf)
upper_limits.append(np.inf)
return CensoredData(y, lower_limits, upper_limits)
test_data = ['25.7', '17.1', '168', '85.3', '66.4', '<49.8', '33.2', '<24.4', '38.3']
data = define_observed_lower_upper(test_data)
Example of Left Censored Model without Imputation
# Bayesian model with pm.Censored
with pm.Model() as model:
# Priors
mu = pm.Uniform("mu", lower=-20, upper=20)
log_sigma = pm.Normal("log_sigma", mu=-0.1744, sigma= 2.5523)
sigma = pm.Deterministic("sigma", pm.math.exp(log_sigma))
precision = pm.Deterministic("precision", 1 / (sigma ** 2))
# Likelihood with censoring
censored_observations = pm.Censored(
"y",
pm.Lognormal.dist(mu=mu, tau=precision),
lower= data.lower_limits,
upper=data.upper_limits,
observed=data.y,
)
# Sampling
trace = pm.sample(
draws=25000,
tune=5000,
chains=4,
return_inferencedata=True,
target_accept=0.90,
initvals={
"mu": np.log(0.3),
"log_sigma": np.log(2.5)
}
)
This is (one of) the JAGS models I am converting
model {
##prior on mu
mu ~dunif(",mu.lower,",",mu.upper,")
##prior on sigma
precision <-1/(sigma*sigma)
sigma <-exp(log.sigma)
log.sigma ~dnorm(",log.sigma.mu,",",log.sigma.prec,")
###likelihood
for (i in 1:n.obs) {
y[i] ~ dlnorm(mu, precision)
CensorType[i] ~ dinterval( y[i] , censorLimitMat[i,] )
}