I actually did not need to do much beyond what you suggested here to get it working, and it does work as intended. My only confusion was about how, when you use observed=data on a random variable and that data has repeats, the values broadcast inside logp. I had to experiment a lot with transposes and the axis= argument, and I had to do this more or less manually because I couldn't set a breakpoint and call pm.logp(rv, value.T).eval() inside the logp definition (it raises a bunch of errors). To my confusion, even if I change
logp.sum(axis=-1)
to
logp.sum(axis=0)
the model still runs (with three clusters and two dimensions, so I assume there is no chance of the two dimensions getting mixed up during broadcasting). So I wonder: is there anything else I should be doing to enforce shape safety in the code below (I don't seem to understand its intricacies very well)? But apart from that, having dist and logp as below
import pymc as pm
import pytensor.tensor as pt


def dist(mu, sigma, lower, upper, size=None):
    mu = pt.as_tensor_variable(mu)
    # A normal censored to [lower, upper]
    return pm.Censored.dist(
        pm.Normal.dist(mu, sigma, size=size),
        lower=lower, upper=upper,
    )


def logp(value, *params):
    # Recreate the RV from the distribution parameters
    rv = dist(*params)
    # Request its logp at the transposed value
    logp = pm.logp(rv, value.T)
    # Sum over the last axis
    return logp.sum(axis=-1)
and doing the mixture as
# priors for mu, sigma here

components = [
    pm.CustomDist.dist(mu[i], sigma, lower, upper,
                       logp=logp, dist=dist, ndim_supp=1)
    for i in range(n_clusters)
]

mix = pm.Mixture("mix", w=w, comp_dists=components,
                 observed=data.T, size=data.T.shape)
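One thing that did help with the shape confusion: the pieces can be evaluated with .eval() outside the model, which is not possible inside the logp definition. Here is a rough sketch, reusing the dist and logp defined above, where the data, the parameter values, and the (5, 2) shape are all made up purely for illustration:

import numpy as np
import pymc as pm

# Made-up stand-in for the data: 5 observations in 2 dimensions
dummy = np.random.normal(size=(5, 2))

# Call the custom logp directly on both orientations of the data with made-up
# parameters (mu, sigma, lower, upper); outside the model, .eval() works, so
# the effect of the internal transpose and the axis=-1 sum becomes visible
print(logp(dummy, 0.0, 1.0, -3.0, 3.0).eval().shape)
print(logp(dummy.T, 0.0, 1.0, -3.0, 3.0).eval().shape)

Comparing the two orientations makes it easier to see which axis actually gets summed.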
Overall this gives another nice, more or less automatic way to stick univariate distributions together (not just the censored normal, obviously) with ndim_supp=1 so that they sample better when mixed! I don't know if it has uses beyond that, though.
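PS: on the axis=-1 vs axis=0 point above, the difference is easy to see on a plain censored normal outside any model: the two sums simply reduce different dimensions of the broadcast logp. A minimal sketch, again with made-up numbers:

import numpy as np
import pymc as pm

rv = pm.Censored.dist(pm.Normal.dist(0.0, 1.0), lower=-3.0, upper=3.0)
vals = np.random.normal(size=(5, 2))       # 5 made-up observations in 2 dims

elemwise = pm.logp(rv, vals)               # broadcasts to shape (5, 2)
print(elemwise.sum(axis=-1).eval().shape)  # (5,): one logp per observation
print(elemwise.sum(axis=0).eval().shape)   # (2,): one logp per dimension

Which of the two is the intended reduction then depends entirely on how the observed values are oriented when they reach logp.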