Problem with NormalMixture

I have a dataset that has a bimodal distribution, and I want to understand what influences the chance of a data point being in one mode or the other. Here’s my model:

# classes = 2, predictors.shape = (129,9) and data.shape = (129,)
with pm.Model(coords={"classes": range(classes), "predictors": predictors.columns}) as class_model:
    # Priors
    sigma = pm.HalfNormal("sigma", sigma=1)
    centers = pm.Normal(
        "centers",
        mu=[data_mean, data_mean],
        sigma=10,
        transform=pm.distributions.transforms.univariate_ordered,
        initval=[1, 2],
        dims="classes"
    )

    intercept = pm.Normal('intercept', mu=0, sigma=20)
    beta = pm.Normal("beta", mu=0, sigma=1, dims="predictors")
    theta = intercept + pm.math.dot(predictors, beta)
    p = pm.Deterministic("p", pm.math.invlogit(theta))
    w = [p, 1-p]

    # Likelihood
    y = pm.NormalMixture("obs", w=w, mu=centers, sigma=sigma, observed=data)

When I run this model, I get this error:
ValueError: Incompatible Elemwise input shapes [(2, 129), (129, 2)]
I’m guessing this happens when making the w array. There are 129 entries in my dataset, but I don’t know where the second dimension is coming from, or why they’re flipped.

I wrote a simpler model without any predictors (just sampling p from a normal distribution) and it worked. I think there’s something I’m missing about the shapes here. Any help would be appreciated.
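
For reference, the version without predictors looked roughly like this (a sketch rather than my exact code; I put a Normal through invlogit so p stays in (0, 1)). The key difference is that p is a single scalar, so w is just a length-2 list of scalars:

classes = 2
with pm.Model(coords={"classes": range(classes)}) as simple_model:
    sigma = pm.HalfNormal("sigma", sigma=1)
    centers = pm.Normal(
        "centers",
        mu=[data_mean, data_mean],
        sigma=10,
        transform=pm.distributions.transforms.univariate_ordered,
        initval=[1, 2],
        dims="classes",
    )

    # One scalar weight for the whole dataset instead of one per observation
    theta = pm.Normal("theta", mu=0, sigma=1)
    p = pm.Deterministic("p", pm.math.invlogit(theta))
    w = [p, 1 - p]

    y = pm.NormalMixture("obs", w=w, mu=centers, sigma=sigma, observed=data)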

It would help if you could share a reproducible snippet. It's unclear what the shapes of your data and dims are.

The data is sensitive, so I can't post a snippet. The shapes of the predictors and observed data are (129, 9) and (129,) respectively (nine predictor columns and one observed column). I can generate some fake data if the shapes aren't enough. There are 2 classes.

Here’s a reproducible example of what I’m seeing. This version only uses two predictors, but has the same number of observations.

import pymc as pm
import numpy as np
import pandas as pd

# Create fake data
mus = np.array([4, 14])
sigmas = np.array([2, 3])

predictors = pd.DataFrame({
    "male": pm.draw(pm.Categorical.dist(p=[0.5, 0.5]), draws=129),
    "age": pm.draw(pm.Normal.dist(mu=35, sigma=3), draws=129) 
})

# Per-observation mixture weights that depend on the predictors,
# rescaled so they stay inside (0, 1)
p = pm.draw(pm.Normal.dist(mu=1 + predictors["age"] * 0.25 + predictors["male"] * 1.0))
p = p / (np.max(p) + 0.1)

# Pick a class for each observation, then draw from that class's mean
index = np.array(pm.draw(pm.Categorical.dist(p=np.column_stack((p, 1 - p))), draws=1))
data = pm.draw(pm.Normal.dist(mu=mus[index], sigma=2), draws=1)

# Model
classes = 2
with pm.Model(coords={"classes": range(classes), "predictors": predictors.columns}) as model:
    # Priors
    sigma = pm.HalfNormal("sigma", sigma=1)
    centers = pm.Normal(
        "centers",
        mu=[data.mean(), data.mean()],
        sigma=10,
        transform=pm.distributions.transforms.univariate_ordered,
        initval=[1, 2],
        dims="classes"
    )

    intercept = pm.Normal('intercept', mu=0, sigma=20)
    beta = pm.Normal("beta", mu=0, sigma=1, dims="predictors")
    theta = intercept + pm.math.dot(predictors, beta)
    p = pm.Deterministic("p", pm.math.invlogit(theta))
    w = [p, 1-p]

    # Likelihood
    y = pm.NormalMixture("obs", w=w, mu=centers, sigma=sigma, observed=data)

# Train
with model:
    idata = pm.sample()

I figured it out. Changing the w term to this fixes the issue:
w = pm.math.stack([p, 1-p], axis=1)
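
For anyone who hits the same error: a plain Python list like [p, 1-p] gets stacked along a new leading axis, so w ends up with shape (2, 129), but NormalMixture wants the mixture components on the last axis, shape (129, 2), to line up with the two centers. Here's a quick shape check (illustrative only, with random numbers standing in for the real weights):

import numpy as np
import pymc as pm

p = np.random.uniform(size=129)  # stand-in for the 129 per-observation weights

w_list = pm.math.stack([p, 1 - p])            # default axis=0: new axis goes first
print(tuple(w_list.shape.eval()))             # (2, 129), the shape from the error

w_fixed = pm.math.stack([p, 1 - p], axis=1)   # components on the last axis
print(tuple(w_fixed.shape.eval()))            # (129, 2), lines up with centers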

Thanks!

Thanks for posting the solution!