In my dataset, the observations are grouped under a few categories (`sample_type` in the code below), and I want to find clusters within each category. I also believe that some clusters may appear in more than one category. I assumed a Dirichlet distribution for within-cluster variation. With that in mind, I started building my model:

```
import numpy as np
import pymc3 as pm

n, k, c, t = 30, 6, 2, 5
sample_type = np.random.choice(range(t), n)
with pm.Model() as model:
    # Concentration parameters of the per-cluster Dirichlet profiles
    cluster_profiles = pm.Exponential("Cluster profile ratio", 1, shape=(k, c))
    # One vector of cluster weights per category
    cluster_weights = pm.Dirichlet("Cluster weights", np.ones((t, c)) / 2, shape=(t, c))
    components = pm.Dirichlet.dist(a=cluster_profiles, shape=(k, c))
```

Where `k` is the dimensionality of my observations, `n` is how many observations I have, `c` is how many clusters I am looking for and `t` is the number of possible categories. So far so good. Then, I added the mixture part:

```
with model:
    pm.MixtureSameFamily("Tumor-based prior", w=cluster_weights[sample_type], comp_dists=components, shape=(n, k))
```

But it throws a `ValueError: Input dimension mis-match. (input[0].shape[0] = 30, input[1].shape[0] = 6)`. I believe it is because `cluster_weights` is supposed to have shape `c`, not `(n, c)`. If that is the case, how can I implement a mixture with observation-specific weights? Am I using the Mixture module correctly?
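To make explicit what I mean by observation-specific weights, here is a minimal NumPy-only sketch of the likelihood I am trying to express. It is not PyMC3 code: the helper names `dirichlet_logpdf` and `mixture_loglik`, the fake data `x`, and the `(c, k)` layout of the concentration array are all my own assumptions for illustration.

```python
import numpy as np
from math import lgamma

n, k, c, t = 30, 6, 2, 5
rng = np.random.default_rng(0)

sample_type = rng.integers(0, t, size=n)         # category of each observation
weights = rng.dirichlet(np.ones(c) / 2, size=t)  # (t, c): one weight vector per category
alpha = rng.exponential(1.0, size=(c, k))        # (c, k): assumed layout, one concentration vector per cluster
x = rng.dirichlet(np.ones(k), size=n)            # (n, k): fake observations on the simplex

def dirichlet_logpdf(x, a):
    """Log density of Dirichlet(a) at each row of x; x is (n, k), a is (k,)."""
    log_norm = lgamma(a.sum()) - sum(lgamma(v) for v in a)
    return log_norm + ((a - 1.0) * np.log(x)).sum(axis=1)

def mixture_loglik(x, w, alpha):
    """Per-observation mixture log-likelihood; w is (n, c), alpha is (c, k)."""
    # Column j holds the log density of every observation under cluster j
    comp = np.stack([dirichlet_logpdf(x, a) for a in alpha], axis=1)  # (n, c)
    # Log-sum-exp over clusters, weighting each row by that row's own weights
    return np.logaddexp.reduce(np.log(w) + comp, axis=1)              # (n,)

w_obs = weights[sample_type]  # (n, c): row i holds the weights of observation i's category
loglik = mixture_loglik(x, w_obs, alpha)
print(w_obs.shape, loglik.shape)
```

With `w_obs` of shape `(n, c)`, each observation is scored against all `c` components using its own category's weights, which is the behaviour I would like the PyMC3 mixture to reproduce.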

I am using PyMC3 v3.11.2 on a Google Colab Linux instance. I have already tried replacing `pm.MixtureSameFamily` with `pm.Mixture`, but it did not help.

Thanks!