Hello,
I’m trying to compute the posterior probability of the latent component assignment in a Gaussian mixture model for a new sample, in order to get a “soft clustering” of new samples.
I generate fake data with:
import numpy as np
import pymc3 as pm
import theano.tensor as tt
import matplotlib.pyplot as plt

ndata = 500
k = 3  # number of mixture components
mus = np.array([-11, 0, 7])  # true component means
zs = np.random.randint(0, k, ndata)  # true component assignments
y_obs = mus[zs] + np.random.normal(size=ndata)  # unit-variance noise around each mean
Following the example notebook, I define the model:
with pm.Model() as model:
    p = pm.Dirichlet("p", a=np.ones(k), shape=k)
    z = pm.Categorical("z", p=p, shape=ndata)
    μ = pm.Normal("μ", mu=[-1, 0, 1], sigma=15, shape=k)
    # break label switching by imposing μ[0] < μ[1] < μ[2]
    order_means_potential = pm.Potential(
        "order_means_potential",
        tt.switch(μ[0] > μ[1], -np.inf, 0) + tt.switch(μ[1] > μ[2], -np.inf, 0),
    )
    σ = pm.Uniform("σ", lower=0, upper=20)
    y = pm.Normal("y", mu=μ[z], sigma=σ, observed=y_obs)
Then I sample from the posterior and from the posterior predictive:
with model:
    trace = pm.sample(1000)
    posterior_pred_trace = pm.sample_posterior_predictive(trace)
I compare the posterior predictive with the observations:
plt.hist(y_obs, bins=40, density=True, alpha=0.5, label="observed")
plt.hist(posterior_pred_trace['y'].reshape(-1), bins=40, density=True, alpha=0.5, label="posterior predictive")
plt.legend();
Now I would like to compute the probability of $z$ for a new observation $y_\mathrm{new}$, in other words $p(z = k \mid y_\mathrm{new}, \mathcal{Y}_\mathrm{obs})$. I get that:

$$p(z=k \mid y_\mathrm{new}, \mathcal{Y}_\mathrm{obs}) = \frac{p(y_\mathrm{new}, z=k \mid \mathcal{Y}_\mathrm{obs})}{\sum_j p(y_\mathrm{new}, z=j \mid \mathcal{Y}_\mathrm{obs})} = \frac{p(y_\mathrm{new} \mid z=k, \mathcal{Y}_\mathrm{obs})\, p(z=k \mid \mathcal{Y}_\mathrm{obs})}{\sum_j p(y_\mathrm{new} \mid z=j, \mathcal{Y}_\mathrm{obs})\, p(z=j \mid \mathcal{Y}_\mathrm{obs})},$$

where the sum in the denominator runs over the mixture components,
but it is not clear to me how to proceed with estimating this quantity. Can it be estimated from the already computed posterior trace, or do I first need a point estimate of the density (MAP or VI)?
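Concretely, I imagine evaluating the per-draw responsibilities and averaging them over the trace, along the lines of the sketch below (y_new = 5.0 is just a made-up test value, and I use scipy for the normal pdf), but I am not sure this is correct:

from scipy import stats

y_new = 5.0  # hypothetical new observation

# posterior draws: p and μ have shape (n_draws, k), σ has shape (n_draws,)
p_post = trace["p"]
mu_post = trace["μ"]
sigma_post = trace["σ"]

# per-draw responsibilities: p_j * N(y_new; μ_j, σ), normalized over the k components
lik = stats.norm.pdf(y_new, loc=mu_post, scale=sigma_post[:, None])  # (n_draws, k)
unnorm = p_post * lik
resp = unnorm / unnorm.sum(axis=1, keepdims=True)

# average over posterior draws ≈ p(z = k | y_new, Y_obs)
print(resp.mean(axis=0))

Thank you!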