Multivariate model with censored data

I have multivariate censored (clipped) data, which I am currently modeling using

y = pm.MvNormal(name="$Y$", mu=mu, cov=Sigma, observed=items, shape=items.shape)

From this example it seems like one simply has to add two terms to the log likelihood.

  1. Number of samples at the lower bound multiplied by the log of MVNormal CDF.
  2. Number of samples at the upper bound multiplied by the log of MVNormal CCDF.
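
In the univariate case, the two terms above can be sketched as follows. This is a stdlib-only illustration, not PyMC3 code. The helper names (`norm_logpdf`, `norm_logcdf`, `norm_logsf`, `censored_loglik`) are hypothetical, and it assumes that any value equal to a bound was clipped there.

```python
import math

def norm_logpdf(x, mu, sigma):
    # Log density of Normal(mu, sigma) at x.
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def norm_logcdf(x, mu, sigma):
    # log P(X <= x); not robust in the far tails, where erfc underflows to 0.
    z = (x - mu) / sigma
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))

def norm_logsf(x, mu, sigma):
    # log P(X > x), i.e. the log of the complementary CDF (CCDF).
    z = (x - mu) / sigma
    return math.log(0.5 * math.erfc(z / math.sqrt(2.0)))

def censored_loglik(data, mu, sigma, lower, upper):
    # Assumption: values at or beyond a bound are treated as censored.
    n_lower = sum(1 for x in data if x <= lower)
    n_upper = sum(1 for x in data if x >= upper)
    interior = [x for x in data if lower < x < upper]

    # Uncensored points contribute their ordinary log density.
    ll = sum(norm_logpdf(x, mu, sigma) for x in interior)
    # Term 1: count at the lower bound times the log CDF at that bound.
    ll += n_lower * norm_logcdf(lower, mu, sigma)
    # Term 2: count at the upper bound times the log CCDF at that bound.
    ll += n_upper * norm_logsf(upper, mu, sigma)
    return ll
```

For example, `censored_loglik([-1.0, 0.2, 1.0], mu=0.0, sigma=1.0, lower=-1.0, upper=1.0)` adds one density term, one lower-bound term, and one upper-bound term.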

Is this correct?

If so, how can I implement this efficiently in PyMC3? Is there a way of getting the logcdf straight from ‘y’?

IIUC there is no closed-form expression for the MvN CDF: https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Cumulative_distribution_function
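
Without a closed form, the CDF has to be approximated numerically. As a rough illustration only, here is a plain Monte Carlo estimate for a standard bivariate normal with correlation `rho`. The function name `mvn2_cdf_mc` and its parameters are hypothetical, it uses only the standard library, and it is not something that would plug into a PyMC3 model as-is.

```python
import math
import random

def mvn2_cdf_mc(x1, x2, rho, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(Z1 <= x1, Z2 <= x2) for a standard
    bivariate normal with correlation rho (assumes |rho| < 1)."""
    rng = random.Random(seed)
    c = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(n_draws):
        # Two independent standard normal draws.
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        # 2x2 Cholesky factor applied by hand to induce correlation rho.
        z1 = e1
        z2 = rho * e1 + c * e2
        if z1 <= x1 and z2 <= x2:
            hits += 1
    return hits / n_draws
```

At `x1 = x2 = 0` the exact value is 1/4 + arcsin(rho)/(2π), which gives a quick sanity check on the estimate. The sampling error shrinks only as 1/sqrt(n_draws), and the estimate is not differentiable with respect to the parameters, which is part of why this is awkward inside a sampler.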

Well, so much for that project. Thanks!

High dimension stuff is difficult :sweat_smile:
