I want to take two censored univariate normals and combine them into one multivariate distribution of support dimension 1 (I will give more context about this at the end of my message for anyone interested). My naive first attempt was:
```python
import numpy as np
import pymc as pm
import pytensor.tensor as pt
from pymc.distributions.multivariate import quaddist_matrix


def dist(mu, cov, lower, upper, size=None):
    mu = pt.as_tensor_variable(mu)
    cov = quaddist_matrix(cov, chol=None, tau=None)
    # cov[i, i] is a variance, so take the square root to get sigma
    dist1 = pm.Normal.dist(mu[0], pt.sqrt(cov[0, 0]), size=size)
    dist2 = pm.Normal.dist(mu[1], pt.sqrt(cov[1, 1]), size=size)
    censored_rv = pt.stack(
        [pt.clip(dist1, lower, upper), pt.clip(dist2, lower, upper)]
    )
    return censored_rv


lower, upper = -10.0, 1.0  # example censoring bounds

with pm.Model() as model:
    sigma = 1
    mu = [-5, -5]
    custom_dist1 = pm.CustomDist(
        "test", mu, sigma * np.eye(2), lower, upper,
        dist=dist, ndim_supp=1,
        ndims_params=[1, 2, 0, 0],  # one entry per parameter: mu, cov, lower, upper
    )
    value = pt.vector("value")
    rv_logp = pm.logp(custom_dist1, value)
    print("custom dist1 logP(-5, -5): " + str(rv_logp.eval({value: [-5, -5]})))
    print(custom_dist1.owner.op.ndim_supp)
```
I was naively hoping that supplying ndim_supp=1 would magically sort everything out for me. However, when you evaluate the logp it returns two values, indicating that what I am creating has a batch dimension rather than a support dimension (and when two such distributions are used as components of a mixture model, their coordinates mix). Curiously, custom_dist1.owner.op.ndim_supp is actually 1! So ndim_supp=1 does not seem to achieve much beyond setting that attribute; it does not actually change the structure of the distribution. Will stack always create batch dimensions no matter what you do? Is there a way around this, perhaps by supplying a custom logp? In that case I would have to implement the censoring manually (e.g. via erf / the normal log-CDF) inside dist or logp, because otherwise I assume PyMC will throw a "Censoring not defined for multivariate distributions" error if I use pm.Censored.
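For concreteness, the kind of workaround I have in mind looks roughly like this (an untested sketch; the names censored_dist and censored_logp are just mine). The draws stay as clipped univariate normals, but an explicit logp censors each coordinate separately (pm.Censored is fine there, since the components are univariate) and then sums over the last axis, which is what should actually give support-dimension-1 behaviour. I am assuming that a user-supplied logp takes precedence over the one CustomDist would infer from dist:

```python
import numpy as np
import pymc as pm
import pytensor.tensor as pt


def censored_dist(mu, sigma, lower, upper, size=None):
    # independent univariate normals along the last axis, clipped to the bounds
    return pt.clip(pm.Normal.dist(mu=mu, sigma=sigma, size=size), lower, upper)


def censored_logp(value, mu, sigma, lower, upper):
    # per-coordinate censored-normal logp (censoring applied to the univariate
    # components, where pm.Censored is allowed), summed over the last axis so
    # that the trailing dimension acts as a support dimension
    per_coord = pm.logp(
        pm.Censored.dist(pm.Normal.dist(mu=mu, sigma=sigma), lower=lower, upper=upper),
        value,
    )
    return per_coord.sum(axis=-1)


lower, upper = -10.0, 1.0  # example bounds, as above

with pm.Model():
    mu = pt.as_tensor_variable([-5.0, -5.0])
    sigma = pt.ones(2)
    x = pm.CustomDist(
        "x", mu, sigma, lower, upper,
        dist=censored_dist, logp=censored_logp,
        ndim_supp=1, ndims_params=[1, 1, 0, 0],
    )
    value = pt.vector("value")
    print(pm.logp(x, value).eval({value: [-5.0, -5.0]}))  # hopefully a single number
```

The sum over axis=-1 is doing the work here: it is what a true support dimension would contribute to the likelihood, so the coordinates can no longer be permuted independently by a mixture.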
Context:
I want to create a mixture of m sets of n normals, which I also want to censor. However, if each set of n normals is created as a batch dimension of univariate normals, its coordinates mix with the other sets' coordinates in the mixture, because the resulting likelihood is symmetric with respect to batch dimensions. On the other hand, if I use MvNormal to break this symmetry, then I cannot censor it. @ricardoV94 has actually come up with a nice solution here which uses automatic marginalization:
However, due to a bug in automatic marginalization, that model currently cannot sample with NUTS (so no numpyro GPU sampling, for instance), and it will likely never be able to handle data that contain unobserved values. Some of my most important test data are large retrospective data sets where missing data are abundant and there is no way to go back and fix that, so it would be great if I could come up with a method that can actually deal with missing observations and lets me use my GPU!
In general, I feel it would be useful to have a simple way to convert a batch of n univariate distributions into a multivariate distribution with core dimension 1 (one important example would be higher-dimensional non-Gaussian mixture models whose components come from univariate distributions). Is this highly non-trivial? Would it be possible to define an analogue of stack that creates a new core dimension and stacks along it?
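To make that last question concrete, here is roughly what I imagine such an analogue of stack could look like, built on CustomDist (again an untested sketch; stack_as_support is a made-up name, and this version only supplies a logp, so it would still need a random/dist implementation, and probably a moment, for full functionality):

```python
import pymc as pm
import pytensor.tensor as pt


def stack_as_support(name, *components, **kwargs):
    """Treat unnamed univariate dists as one RV with a single support dimension."""

    def logp(value):
        # sum the component log-densities along the stacked axis, so the
        # trailing dimension behaves as a core dimension rather than a batch one
        per_coord = pt.stack(
            [pm.logp(comp, value[..., i]) for i, comp in enumerate(components)],
            axis=-1,
        )
        return per_coord.sum(axis=-1)

    return pm.CustomDist(
        name, logp=logp, ndim_supp=1, shape=(len(components),), **kwargs
    )


with pm.Model():
    # components can come from different univariate families
    comps = [pm.Normal.dist(-5.0, 1.0), pm.Gamma.dist(2.0, 1.0)]
    y = stack_as_support("y", *comps)
    value = pt.vector("value")
    print(pm.logp(y, value).eval({value: [-5.0, 1.0]}))  # one joint log-density
```

Something like this seems plausible for the logp side, but I do not know whether it would play nicely with the rest of the machinery (forward sampling, pm.Mixture, etc.), which is partly why I am asking.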