Masking missing values of predictors

Dear PyMC3 Community,

I am looking for someone who has worked with missing data in both the predictors (Xi) and y_obs.
My understanding is that there is no need to impute beforehand, e.g. as part of a preprocessing pipeline, because the imputation can be written directly into the Bayesian model.
I would like to fit a Bernoulli classification model on X1 and X2, both of which contain missing values.
If I exclude the missing values, the model runs, but if I keep them, I run into errors.

I would greatly appreciate any advice on this.

Please find below the script.

Thank you very much in advance

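import numpy as np
import pymc3 as pm

# Boolean mask: True where a predictor value is missing
x_missing = np.isnan(x_train)
X_train = np.ma.masked_array(x_train, mask=x_missing)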
#y_train.shape is (97, 1)
#X_train.shape is (97, 2)
X_shape = len(x_missing)
with pm.Model() as model:
  #Define priors
  beta = pm.Normal('beta', 0, 10)

  #Imputation of X missing values
  Xmu = pm.Normal('Xmu', 0, 1, shape=X_shape)
  X_modeled = pm.Normal('X', mu=Xmu, sd=10, observed=X_train)

  #Define likelihood
  lp = pm.Deterministic('lp', pm.math.dot(X_modeled, beta))

  #Define posterior
  y_obs = pm.Bernoulli('y_obs', p=lp, observed=y_train)      

  trace = pm.sample()

You will need to make sure Xmu is broadcastable to X_train, for example:

Xmu = pm.Normal('Xmu', 0, 1, shape=(X_shape, 1))

or

Xmu = pm.Normal('Xmu', 0, 1, shape=(X_shape, 2))
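
The broadcasting requirement can be checked with NumPy alone before building the model. The following is only a sketch, assuming the shapes from the question (97 rows, 2 predictors); the helper name broadcasts_to is made up for illustration and is not part of PyMC3:

```python
import numpy as np

n_rows, n_cols = 97, 2  # shapes taken from the question: X_train is (97, 2)


def broadcasts_to(shape, target):
    """Return True if an array of `shape` can be broadcast to `target`."""
    try:
        np.broadcast_to(np.empty(shape), target)
        return True
    except ValueError:
        return False


# A flat vector of length 97 does not line up with the trailing axis of size 2.
flat_ok = broadcasts_to((n_rows,), (n_rows, n_cols))
# One mean per row, shared across both columns.
per_row_ok = broadcasts_to((n_rows, 1), (n_rows, n_cols))
# One mean per entry.
per_entry_ok = broadcasts_to((n_rows, n_cols), (n_rows, n_cols))
# One mean per column also broadcasts.
per_col_ok = broadcasts_to((n_cols,), (n_rows, n_cols))

print(flat_ok, per_row_ok, per_entry_ok, per_col_ok)  # False True True True
```

The flat shape (97,) is what shape=X_shape produces in the original script, which is why it cannot be combined with a (97, 2) observed array.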

Hi Junpeng,

Thank you very much for your message.

I also suspected that the issue was related to the shape, but I still get a bad initial energy error when sampling the model.

The X matrix has X1 and X2, each with a different number of missing values. Do you think I’m missing something in this regard?

Also, this may be a naive question, but should x_missing contain the actual missing values, or the boolean NumPy array that marks which entries are masked and treated as latent variables?

The generated x_missing will contain the fill-in values for the masked latent variable. As for the bad initial energy problem, you can search the Discourse for suggested solutions.
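
To make the mask question concrete: the mask argument of np.ma.masked_array takes the boolean array, while the data argument keeps the original values with the NaN entries still in place. A small sketch with made-up numbers (the 4x2 array below is invented for illustration and is not the data from this thread):

```python
import numpy as np

# Made-up data: 4 rows, 2 predictors, with NaN marking missing entries.
x_train = np.array([[0.5, np.nan],
                    [np.nan, 1.2],
                    [0.1, 0.3],
                    [np.nan, -0.7]])

# Boolean array: True where a value is missing.
x_missing = np.isnan(x_train)

# The masked array keeps the data and the boolean mask side by side.
X_train = np.ma.masked_array(x_train, mask=x_missing)

# Number of missing entries per column; the columns may differ.
n_missing_per_column = x_missing.sum(axis=0)

print(x_missing.dtype)         # bool
print(n_missing_per_column)    # [2 1]
print(X_train.count(axis=0))   # observed entries per column: [2 3]
```

Columns with different numbers of missing values are not a problem for the mask itself, since it has the same shape as the data.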