Dimensionality specification for a batch of matrix-valued random variables (e.g. Wishart distribution)

My problem is in this line: how should I define a batch of K Wishart distributions, each with a $d \times d$ scale matrix?

    taus = pm.Wishart("taus", nu=d, V=S_inv, shape=(K,))

Detailed description:
I’m currently constructing a Dirichlet process mixture model with a multivariate normal likelihood, using the truncated stick-breaking construction. To be specific, this is a mixture of $K$ components, each associated with a mean parameter $\mu$ and a precision (inverse covariance) matrix $\tau$, i.e. a normal distribution $\mathcal{N}(\mu, \tau^{-1})$. The data consist of $N$ observations with $d$ features.
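For reference, the truncated generative model I have in mind is roughly the following (here $\bar{t}$ and $S$ are the empirical mean and covariance of the data, and the $10\,I_d$ prior covariance corresponds to the `cov=np.eye(d)/0.1` in the code below):

$$
\begin{aligned}
\alpha &\sim \mathrm{Gamma}(1,\, 1) \\
w &\sim \mathrm{StickBreaking}(\alpha) \\
\mu_k &\sim \mathcal{N}(\bar{t},\; 10\, I_d), \qquad k = 1, \dots, K \\
\tau_k &\sim \mathrm{Wishart}(\nu = d,\; V = S^{-1}) \\
z_i &\sim \mathrm{Categorical}(w), \qquad i = 1, \dots, N \\
x_i \mid z_i &\sim \mathcal{N}\!\bigl(\mu_{z_i},\; \tau_{z_i}^{-1}\bigr)
\end{aligned}
$$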

In my construction of the model, I want each mixture component to have its own distinct $\mu$ and $\tau$. Therefore, I have to specify the batch size, i.e. the number of components, explicitly in the definition of the variable, like this:

    mus = pm.MvNormal("mu", tbar, cov=np.eye(d)/0.1, shape=(K,d))
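This seems to behave as I expect; a quick sanity check with `pm.draw` on an unnamed distribution (standalone values, used here only to inspect the batched draw shape):

    import numpy as np
    import pymc as pm

    # a batch of K=30 mean vectors in d=3 dimensions; one draw has shape (K, d)
    draw = pm.draw(pm.MvNormal.dist(mu=np.zeros(3), cov=np.eye(3), shape=(30, 3)))
    print(draw.shape)  # (30, 3)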

However, this becomes a little confusing when it comes to the definition of the covariance/precision matrix. I’ve tried several ways to specify the shape parameter, but they all lead to some error. (I’ve tried shape=(K,), which I think is simply ignored by the program; and also shape=(K,d) and shape=(K,d,d), both of which raised errors.) I also tried changing the shape of the parameter nu to carry the batch information implicitly, which raised an error as well. How should I specify the batch size for matrix-valued random variables like this?

When I visualize the model defined by the chunk below, I find that the observations have dimensionality $N \times N$, which is not what I expect (I’m expecting $N \times d$). Here is my code:

    import numpy as np
    import pymc as pm

    with pm.Model(coords={"component": np.arange(K), "obs_id": np.arange(N), "feature": np.arange(d)}) as model:
        alpha = pm.Gamma("alpha", 1.0, 1.0)
        w = pm.StickBreakingWeights("w", alpha, K-1)

        # one mean vector per component: batch shape (K, d)
        mus = pm.MvNormal("mu", tbar, cov=np.eye(d)/0.1, shape=(K, d))
        # one precision matrix per component -- this is the line I can't get right
        taus = pm.Wishart("taus", nu=d, V=S_inv, shape=(K,))
        # component assignment for each observation
        category = pm.Categorical("category", p=w, shape=N)

        obs = pm.MvNormal("obs", mu=mus[category], tau=taus[category], observed=df.values)

I’ve been testing the model on a toy example with 2 true components of a 3-dimensional Gaussian distribution:

    import numpy as np
    import pandas as pd

    N = 100  # number of observations
    K = 30   # truncation level of the mixture
    d = 3    # number of features

    mu1 = -np.ones(d)
    mu2 = np.ones(d)
    s1 = np.diag([1, 1, 0.01])
    s2 = np.diag([1, 0.01, 1])

    t1 = np.random.multivariate_normal(mean=mu1, cov=s1, size=(50,))
    t2 = np.random.multivariate_normal(mean=mu2, cov=s2, size=(50,))
    t = np.concatenate((t1, t2))

    df = pd.DataFrame({
        "v1": t[:, 0],
        "v2": t[:, 1],
        "v3": t[:, 2],
    })

    # preprocessing: empirical mean and covariance of the data
    tbar = np.mean(t, axis=0)
    D_center = t - tbar
    S = np.matmul(D_center.transpose(), D_center) / N
    S_inv = np.linalg.inv(S)

Besides, I’ve been wondering whether it’s possible to collapse (marginalize out) the category variable, which is discrete and makes it hard for the program to perform variational inference. Would using Mixture in the model help, as in the sketch below? (This is not my major concern right now though lol, I’m really struggling with dimensionality!)
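Something like the following is what I have in mind. This is an untested sketch: I’m not sure whether `pm.Mixture` accepts a list of multivariate components like this, and I’ve replaced the single batched Wishart with K separate `pm.Wishart` variables just to get distinct, plain $d \times d$ precision matrices:

    with pm.Model() as marginalized_model:
        alpha = pm.Gamma("alpha", 1.0, 1.0)
        w = pm.StickBreakingWeights("w", alpha, K-1)
        mus = pm.MvNormal("mu", tbar, cov=np.eye(d)/0.1, shape=(K, d))

        # one unnamed MvNormal per component, each with its own (d, d) Wishart precision
        components = [
            pm.MvNormal.dist(mu=mus[k], tau=pm.Wishart(f"tau_{k}", nu=d, V=S_inv))
            for k in range(K)
        ]
        # Mixture sums over the component assignments inside the likelihood,
        # so no discrete `category` variable is left in the model
        obs = pm.Mixture("obs", w=w, comp_dists=components, observed=df.values)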

Thanks for any tips or advice!

P.S. I’ve also posted this question on Stack Overflow; I’ll keep both posts updated. Thanks again!

Have you seen Distribution Dimensionality — PyMC dev documentation?

Also, what error are you seeing? Most of our multivariate distributions’ logp can’t be batched yet: Increase support for batched multivariate distributions · Issue #5383 · pymc-devs/pymc · GitHub

Thanks, @ricardoV94!
Actually, I have read through the Distribution Dimensionality documentation, and I thought I understood the basics.

When I define the component-wise precision prior like this, the shape argument is ignored (as the documentation told me it would be):

    taus = pm.Wishart("taus", nu=d, V=S_inv, shape=(K,))

And this definition already breaks down later on, with

AssertionError: Could not broadcast dimensions

Another attempt is to be explicit about the batch shape:

    taus = pm.Wishart("taus", nu=d, V=S_inv, shape=(K, d, d))
    category = pm.Categorical("category", p=w, shape=N)
    obs = pm.MvNormal("obs", mu=mus[category], tau=taus[category], observed=df.values)

But it seems that in the definition of obs, I’m not successfully indexing the component assigned to each sample; it fails with this message:

ValueError: tau must be two dimensional.

And I’m completely stuck here, since the indexing step looks normal (to me). How should I deal with this?
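To convince myself the indexing itself is fine, I mimicked it with plain numpy stand-ins (values chosen only to check shapes):

    # numpy stand-ins for the symbolic variables, purely to inspect shapes
    taus_val = np.stack([S_inv] * K)           # (K, d, d), like taus with shape=(K, d, d)
    category_val = np.random.randint(0, K, N)  # (N,), like a draw of `category`
    print(taus_val[category_val].shape)        # (N, d, d): one d-by-d matrix per observation

So `taus[category]` should be a 3-D tensor of shape (N, d, d), one precision matrix per observation, and my guess is that the error comes from MvNormal refusing a batched tau rather than from the indexing itself.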

Does “the logp cannot be batched” mean that I can only stack variables up into a container?

For a distribution whose logp cannot be batched yet, is it a good idea to simply stack the components of a mixture, like this?

    taus = pm.math.stack([pm.Wishart.dist(nu=d, V=S_inv) for _ in range(K)])

This leads me to an inelegant graphical model, and it raises the same error:

ValueError: tau must be two dimensional.

It seems to me that indexing a single component of taus with tau=taus[category] is what’s not working.

Is this also part of the batching problem, @ricardoV94? Thanks for answering again!