I have a dataset that has a bimodal distribution, and I want to understand what influences the chance of a data point being in one mode or the other. Here’s my model:

```
# classes = 2, predictors.shape = (129, 9) and data.shape = (129,)
with pm.Model(coords={"classes": range(classes), "predictors": predictors.columns}) as class_model:
    # Priors
    sigma = pm.HalfNormal("sigma", sigma=1)
    centers = pm.Normal(
        "centers",
        mu=[data_mean, data_mean],
        sigma=10,
        transform=pm.distributions.transforms.univariate_ordered,
        initval=[1, 2],
        dims="classes",
    )
    intercept = pm.Normal("intercept", mu=0, sigma=20)
    beta = pm.Normal("beta", mu=0, sigma=1, dims="predictors")
    theta = intercept + pm.math.dot(predictors, beta)
    p = pm.Deterministic("p", pm.math.invlogit(theta))
    w = [p, 1 - p]
    # Likelihood
    y = pm.NormalMixture("obs", w=w, mu=centers, sigma=sigma, observed=data)
```

When I run this model, I get this error:

`ValueError: Incompatible Elemwise input shapes [(2, 129), (129, 2)]`

I’m guessing this happens when making the `w` array. There are 129 entries in my dataset, but I don’t know where the second dimension is coming from, or why they’re flipped.

I wrote a simpler model without any predictors (just sampling `p` from a normal distribution) and it worked. I think there’s something I’m missing about the shapes here. Any help would be appreciated.

It may help if you can share a reproducible snippet. It’s unclear what the shapes of your data and dims are.

The data is sensitive, so I can’t post a snippet. The shapes of the predictors and observed data are `(129, 9)` and `(129,)` respectively (there are nine predictor columns and one observed column). I’ll write some fake data if the shapes aren’t enough. There are 2 classes.

Here’s a reproducible example of what I’m seeing. This version only uses two predictors, but has the same number of observations.

```
import pymc as pm
import numpy as np
import pandas as pd
# Create fake data
mus = np.array([4, 14])
sigmas = np.array([2, 3])
predictors = pd.DataFrame({
    "male": pm.draw(pm.Categorical.dist(p=[0.5, 0.5]), draws=129),
    "age": pm.draw(pm.Normal.dist(mu=35, sigma=3), draws=129),
})
p = pm.draw(pm.Normal.dist(mu=1 + predictors["age"] * 0.25 + predictors["male"] * 1.0))
p = p / (np.max(p) + 0.1)
index = np.array(pm.draw(pm.Categorical.dist(p=np.column_stack((p, 1 - p))), draws=1))
data = pm.draw(pm.Normal.dist(mu=mus[index], sigma=2), draws=1)

# Model
classes = 2
with pm.Model(coords={"classes": range(classes), "predictors": predictors.columns}) as model:
    # Priors
    sigma = pm.HalfNormal("sigma", sigma=1)
    centers = pm.Normal(
        "centers",
        mu=[data.mean(), data.mean()],
        sigma=10,
        transform=pm.distributions.transforms.univariate_ordered,
        initval=[1, 2],
        dims="classes",
    )
    intercept = pm.Normal("intercept", mu=0, sigma=20)
    beta = pm.Normal("beta", mu=0, sigma=1, dims="predictors")
    theta = intercept + pm.math.dot(predictors, beta)
    p = pm.Deterministic("p", pm.math.invlogit(theta))
    w = [p, 1 - p]
    # Likelihood
    y = pm.NormalMixture("obs", w=w, mu=centers, sigma=sigma, observed=data)

# Train
with model:
    idata = pm.sample()
```

I figured it out. Changing the `w` term to this fixes the issue:

`w = pm.math.stack([p, 1-p], axis=1)`
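For anyone who hits the same error: as far as I can tell, the mismatch comes from passing `w` as a plain Python list, which puts the two mixture components on the *first* axis instead of the *last*, where `NormalMixture` expects them. A quick check of the shapes with plain NumPy (dummy array, not the real data):

```
import numpy as np

p = np.random.rand(129)  # stand-in for the per-observation weight

# A list [p, 1 - p] gets converted to an array with the two
# components on the leading axis:
w_list = np.asarray([p, 1 - p])
print(w_list.shape)  # (2, 129)

# NormalMixture lines the component axis up with mu's shape (2,)
# on the *last* axis, hence the (2, 129) vs (129, 2) clash.
# Stacking along axis=1 puts the components where they belong:
w_stacked = np.stack([p, 1 - p], axis=1)
print(w_stacked.shape)  # (129, 2)
```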

Thanks!

Thanks for posting the solution!