Calculating the likelihood based only on non-missing observed values


I am trying to handle missing target values by using a mask, and I want the likelihood to be calculated only at the points where the target values are not missing. How should I do this?

My model looks like this.

    import numpy as np
    import pymc3 as pm
    import theano

    # X_train, Y_train, init_1, init_out and n_hidden1 are defined elsewhere.
    ann_input = theano.shared(X_train)
    ann_output = theano.shared(Y_train)

    with pm.Model() as neural_network:
        # Weights: 5 inputs -> 3 hidden units -> 1 output
        weights_in_1 = pm.Normal('w_in_1', 0, sd=1, shape=(5, 3), testval=init_1)
        weights_1_out = pm.Normal('w_1_out', 0, sd=1, shape=(3,), testval=init_out)
        hidden1_bias = pm.Normal('hidden1_bias', mu=0, sd=1, shape=n_hidden1)
        hidden_out_bias = pm.Normal('hidden_out_bias', mu=3, sd=1)

        act_1 = pm.math.tanh(pm.math.dot(ann_input, weights_in_1) + hidden1_bias[None, :])
        regression = pm.math.dot(act_1, weights_1_out) + hidden_out_bias

        # Likelihood; ann_output contains the masked targets
        pm.Normal('out', mu=regression, sd=np.sqrt(0.9), observed=ann_output)

        train_trace = pm.sample()

In the likelihood, ann_output has some values masked; part of it looks like this:


Following the example from the coal mining disasters case study, I set the missing values to -999 and masked that value, and got a result.

To check whether the data was really masked, I then changed the placeholder to 999 and masked the value 999 instead, and got a result again.
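
For reference, the masking step described above can be sketched with NumPy alone. The array `y_raw` and its values are made up for illustration and are not the data from this thread; only the sentinel -999 comes from the post.

```python
import numpy as np

# Hypothetical targets; -999 marks a missing observation.
y_raw = np.array([1.2, -999.0, 0.7, 3.1, -999.0])

# masked_values flags every entry (approximately) equal to the sentinel.
y_masked = np.ma.masked_values(y_raw, value=-999)

n_missing = int(np.ma.count_masked(y_masked))  # number of masked entries
observed_only = y_masked.compressed()          # plain array of unmasked values

print(y_masked.mask)
print(n_missing, observed_only)
```

A quick check like this, run on the array right before it is handed to the model, shows whether the mask is actually present at that point.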


I am not sure I understand what you are trying to do. But in general you don't have a point value as the likelihood for missing data: the missing values are represented by a density, and when you sample, the MCMC samples of the missing data are drawn from that density.


I concur with @junpenglao: in Bayesian analysis you don’t have to give missing values any special treatment. To compute the likelihood you only have to consider the values that are observed.
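
As a rough illustration of that point, the log-likelihood can be summed over the unmasked entries only. This is a NumPy sketch, not PyMC3 code; the helper `normal_logpdf` and the example values of `y` and `mu` are assumptions made up here, while the standard deviation `np.sqrt(0.9)` is taken from the model above.

```python
import numpy as np

def normal_logpdf(y, mu, sd):
    """Elementwise log density of Normal(mu, sd) evaluated at y."""
    return -0.5 * np.log(2 * np.pi * sd**2) - 0.5 * ((y - mu) / sd) ** 2

# Hypothetical targets with one missing entry (sentinel -999) and model means.
y = np.ma.masked_values([0.5, -999.0, 1.5], value=-999)
mu = np.array([0.0, 1.0, 2.0])
sd = np.sqrt(0.9)

observed = ~np.ma.getmaskarray(y)  # True where a value was observed
loglik_observed = normal_logpdf(y.data[observed], mu[observed], sd).sum()

# Same number as building the likelihood from the observed pairs directly.
loglik_check = normal_logpdf(np.array([0.5, 1.5]), np.array([0.0, 2.0]), sd).sum()
```

The masked entry contributes nothing to `loglik_observed`, so the sentinel value never enters the calculation.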


Thanks for the reply. It seems that when the variable is wrapped in theano.shared, the likelihood function no longer sees the mask. When we tried without theano.shared, feeding the input directly into the model, it worked well, as @junpenglao mentioned. :)
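
One way a mask can get lost is easy to reproduce with plain NumPy: converting a masked array to an ordinary array keeps the sentinel numbers but drops the mask. Whether theano.shared does exactly this internally is an assumption, not something confirmed in this thread; the snippet only illustrates the failure mode.

```python
import numpy as np

y_masked = np.ma.masked_values([1.0, -999.0, 2.0], value=-999)

# np.asarray returns the underlying data as a base ndarray: the mask is gone,
# and -999 would now be treated as a real observation.
y_plain = np.asarray(y_masked)

has_mask_before = isinstance(y_masked, np.ma.MaskedArray)
has_mask_after = isinstance(y_plain, np.ma.MaskedArray)

print(has_mask_before, has_mask_after, y_plain)
```

If something similar happens on the way into the model, the sentinel values end up in the likelihood as ordinary data, which would be consistent with the behaviour reported above.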


I’m facing a very similar problem at the moment: I have a masked array in a pm.Minibatch and in a theano.shared variable. This could explain the wrong estimates I am getting. Is there an open issue for this somewhere?

Where and how is the treatment of missing observed values for numpy masked arrays implemented? I would like to look into it to find out whether this is my problem.


I am not sure this would work; at least, I don't think it is one of the cases we tested.

Internally, PyMC3 searches for masked values in the observed data and creates a free random variable for them. In effect it adds a new random variable and does prior predictive sampling from it:
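
The idea described above can be sketched in NumPy as follows. This is a conceptual illustration, not PyMC3's actual source; the helper `fill_missing` and all numbers are hypothetical.

```python
import numpy as np

# Hypothetical observed vector with two masked (missing) targets.
y = np.ma.masked_values([0.5, -999.0, 1.5, -999.0], value=-999)
mask = np.ma.getmaskarray(y)

def fill_missing(y_masked, missing_values):
    """Return a full vector: observed entries kept, masked ones replaced."""
    full = np.array(y_masked.filled(0.0), dtype=float)
    full[np.ma.getmaskarray(y_masked)] = missing_values
    return full

# In a sampler, `missing_values` would be a free parameter updated each draw;
# here one made-up draw stands in for it.
draw = np.array([0.9, 2.1])
y_full = fill_missing(y, draw)

print(y_full)
```

The observed entries stay fixed while the masked slots are filled by the extra variable. In the coal mining disasters case study, those imputed values show up in the trace under the name `disasters_missing`.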