Hello all,

I am fairly new to pymc and probabilistic programming in general. I’ve been trying to use MCMC for the treatment of dataset with left-censored data and found this very nice example.

I’ve run some tests with the model and I get good results, but I have troubles extracting samples from the left censored posterior distribution. The example says that *the imputed censored model is capable of generating data sets of censored values (sample from the posteriors of left_censored and right_censored to generate them)*, however when sampling from the posteriors I get data from the entire distribution and not only the censored values. A minimal repro is below. What I am doing wrong?

I also have a more general question related to this topic: for a simple distribution like in this case, is there a reason for using the `sample_posterior_predictive`

function instead of simply drawing random samples from a newly defined distribution with the parameters (sigma and mu) obtained from the MCMC simulation?

Thanks a lot already,

best regards

## Minimal Repro

import numpy as np

import pymc as pm

import matplotlib.pyplot as plt

import seaborn as sns## Define input data

N_data = 150

mu_th = 1.5

sigma_th = 0.8

Q = 0.5## Generate random data

Y_data = np.random.normal(loc=mu_th, scale=sigma_th, size=N_data)

## Censor data

RL = np.quantile(Y_data, Q)

Y_obs = Y_data[Y_data>RL]

N_cen = len(Y_data[Y_data<=RL])

N_obs = len(Y_obs)with pm.Model() as imputed_censored_model:

mu = pm.Normal(“mu”, mu = np.mean(Y_obs), sigma = 1)

sigma = pm.LogNormal(“sigma”, mu = np.std(Y_obs), sigma = 1)`left_censored = pm.Normal("left_censored", mu,sigma, transform=pm.distributions.transforms.Interval(None, RL), shape=N_cen, initval=np.full(N_cen, RL - 1)) observed = pm.Normal("obs", mu=mu, sigma=sigma, observed=Y_obs, shape=N_obs) # Sampling trace = pm.sample(1000, return_inferencedata=True) # Posterior predictive posterior_predictive_censored = pm.sample_posterior_predictive(trace, var_names = ['left_censored']) posterior_predictive_obs = pm.sample_posterior_predictive(trace, var_names = ['obs'])`

## Extract values

ppc_val = posterior_predictive_censored.posterior_predictive[‘left_censored’].values.flatten()

ppo_val = posterior_predictive_obs.posterior_predictive[‘obs’].values.flatten()## Figure

fig, axs = plt.subplots()

sns.kdeplot(ppc_val, ax = axs, label = ‘left censored’)

sns.kdeplot(ppo_val, ax = axs, ls = ‘–’, label = ‘obs’)

axs.legend()