My problem is in this line: **how should I define a batch of K Wishart distributions, each with a d\times d scale matrix?**

```
taus = pm.Wishart("taus", nu=d, V=S_inv, shape=(K,))
```

**Detailed description:**

I’m currently constructing a Dirichlet Process Mixture Model with a **multivariate normal** likelihood and a truncation-based definition. Specifically, this is a mixture of K components, each associated with a mean parameter *mu*, an inverse covariance matrix *tau*, and a normal distribution N(mu, tau^{-1}). The data consist of N observations with d features.
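For concreteness, the generative process I have in mind looks like this (plain NumPy for illustration, with toy sizes; not the actual model code):

```python
import numpy as np

rng = np.random.default_rng(0)
K, d, N = 5, 3, 200  # toy truncation level, dimensionality, sample size

# stick-breaking construction of the truncated weights
betas = rng.beta(1.0, 1.0, size=K - 1)
sticks = np.concatenate([betas, [1.0]])
remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas)])
w = sticks * remaining  # telescoping product, so the weights sum to 1

# one mean and one precision matrix per component (identity precisions here)
mus = rng.normal(size=(K, d))
taus = np.stack([np.eye(d) for _ in range(K)])  # shape (K, d, d)

# each observation picks a component, then draws from N(mu_k, tau_k^{-1})
category = rng.choice(K, size=N, p=w)
obs = np.stack([
    rng.multivariate_normal(mus[k], np.linalg.inv(taus[k]))
    for k in category
])  # shape (N, d)
```

The point is that `taus` should be a batch of K full d\times d matrices, which is exactly the shape I can’t get PyMC’s `Wishart` to produce.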

In my construction of the model, I want each mixture component to have its own distinct mu and tau. Therefore, I have to specify the batch size, i.e. the number of components, **explicitly** in the definition of the variable, like this:

```
mus = pm.MvNormal("mu", tbar, cov=np.eye(d)/0.1, shape=(K,d))
```

However, this becomes confusing when it comes to the definition of the covariance/precision matrix. I’ve tried several ways to specify the shape parameter, but they all lead to errors: `shape=(K,)`, which I think is simply ignored by the program; and `shape=(K,d)` and `shape=(K,d,d)`, which both raised errors. I also tried changing the shape of the parameter `nu` to carry the batch information **implicitly**, which also raised an error. **How should I specify the batch size for matrix-valued random variables like this?**
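For comparison, the batching behaviour I’m after is what SciPy’s Wishart sampler gives, where K draws with a d\times d scale come back stacked as (K, d, d) (plain SciPy here, not PyMC):

```python
import numpy as np
from scipy.stats import wishart

d, K = 3, 4
scale = np.eye(d)  # stand-in for my S_inv

# K independent draws, each a d x d positive-definite matrix
taus = wishart.rvs(df=d, scale=scale, size=K)
print(taus.shape)  # (4, 3, 3)
```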

When I visualized the model (defined by the chunk below), I found the observations have dimensionality N\times N, which is not what I expected (I’m expecting N\times d). Here is my code:

```
with pm.Model(coords={"component": np.arange(K), "obs_id": np.arange(N), "feature": np.arange(d)}) as model:
    alpha = pm.Gamma("alpha", 1.0, 1.0)
    w = pm.StickBreakingWeights("w", alpha, K-1)
    mus = pm.MvNormal("mu", tbar, cov=np.eye(d)/0.1, shape=(K,d))
    taus = pm.Wishart("taus", nu=d, V=S_inv, shape=(K,))
    category = pm.Categorical("category", p=w, shape=N)
    obs = pm.MvNormal("obs", mu=mus[category], tau=taus[category], observed=df.values)
```
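My suspicion (unverified) is that `shape=(K,)` is ignored, so `taus` stays a single d\times d matrix, and `taus[category]` then indexes its *rows* with an N-long vector instead of selecting whole matrices. A quick NumPy check of that indexing pattern, with hypothetical labels:

```python
import numpy as np

N, K, d = 100, 30, 3
tau = np.eye(d)  # a single d x d matrix, as if shape=(K,) were ignored
# hypothetical component labels (taken mod d so row-indexing stays in bounds)
category = np.random.randint(K, size=N) % d

# row-indexing a (d, d) matrix with an N-vector yields (N, d), not (N, d, d)
print(tau[category].shape)  # (100, 3)

# with a proper (K, d, d) batch, the same indexing selects whole matrices
taus = np.stack([np.eye(d)] * K)
print(taus[category].shape)  # (100, 3, 3)
```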

I’ve been testing the model on a toy example with two true components, each a 3-dimensional Gaussian:

```
import numpy as np
import pandas as pd

N = 100  # number of observations
K = 30   # truncation level
d = 3    # number of features
mu1 = -np.ones(d)
mu2 = np.ones(d)
s1 = np.diag([1, 1, 0.01])
s2 = np.diag([1, 0.01, 1])
t1 = np.random.multivariate_normal(mean=mu1, cov=s1, size=(50,))
t2 = np.random.multivariate_normal(mean=mu2, cov=s2, size=(50,))
t = np.concatenate((t1, t2))
df = pd.DataFrame({
    "v1": t[:, 0],
    "v2": t[:, 1],
    "v3": t[:, 2],
})
# preprocessing: center the data and compute the empirical precision
tbar = np.mean(t, axis=0)
D_center = t - tbar
S = np.matmul(D_center.transpose(), D_center) / N
S_inv = np.linalg.inv(S)
```

Besides, I’ve been wondering whether it’s possible to *collapse* the `category` variable, which is discrete and makes it hard for the program to perform variational inference. Would using `Mixture` in the model help? (This is not my major concern right now though, lol, I’m really struggling with dimensionality!)
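For context, collapsing `category` would mean evaluating the marginal likelihood p(x) = sum_k w_k N(x; mu_k, Sigma_k) directly, which is my understanding of what a mixture likelihood computes. A small NumPy/SciPy sketch of that marginalization (toy weights and parameters, not my actual model):

```python
import numpy as np
from scipy.stats import multivariate_normal
from scipy.special import logsumexp

rng = np.random.default_rng(1)
K, d, N = 3, 2, 10
w = np.full(K, 1.0 / K)           # toy mixture weights
mus = rng.normal(size=(K, d))     # toy component means
covs = np.stack([np.eye(d)] * K)  # toy component covariances
x = rng.normal(size=(N, d))

# log p(x_i) = logsumexp_k [ log w_k + log N(x_i; mu_k, cov_k) ]
log_comp = np.stack([
    multivariate_normal.logpdf(x, mean=mus[k], cov=covs[k])
    for k in range(K)
])  # shape (K, N)
log_marginal = logsumexp(np.log(w)[:, None] + log_comp, axis=0)
print(log_marginal.shape)  # (10,)
```

No discrete variable remains, so the whole density is differentiable in the continuous parameters.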

**Thanks for any tips or advice!**

P.S. I’ve also posted this question on Stack Overflow; I’ll keep both posts updated. Thanks again!