'Could not broadcast dimensions' / multi-class classification

Firstly, I’d like to mention that I am a beginner in Bayesian statistics and PyMC. Please excuse any misunderstandings.

While studying textbooks on Bayesian statistics and machine learning, I came across the MCMC method. As a challenge, I decided to use MCMC to classify the Iris dataset, a standard toy dataset in machine learning, though I'm aware it can also be approached with SVM or other algorithms. This choice was intentional, with my future research in mind.

I successfully implemented binary classification, but I’m encountering difficulties with multi-class classification (three or more classes).

Full source code (Google Colaboratory):

import pymc as pm

with pm.Model() as model:

    # data
    X_data = pm.Data('X_bias', X_train_bias, mutable=False)
    y_data = pm.Data('y', y_train_cat, mutable=False)

    # prior distribution of w
    w = pm.Normal('w', mu=0, sigma=10, shape=(X_train_bias.shape[1], y_train_cat.shape[1]))

    # linear predictor and class probabilities via softmax
    y = pm.math.dot(X_data, w)
    pred = pm.math.softmax(y)

    # likelihood
    likelihood = pm.Categorical('likelihood', p=pred, observed=y_data)

    # run sampling
    trace = pm.sample()

    # this raises an error, why?

I’ve been attempting to resolve the ‘Could not broadcast dimensions’ error, but despite various efforts, the issue remains unresolved.

This might not be a typical usage of PyMC, but any assistance would be greatly appreciated!

I suggest you call pm.draw on intermediate variables to check that their shapes are what you expect, such as pm.draw(pred).shape.

Note that the Categorical requires the probabilities to be on the last axis. If you have 3 categories and 10 events with distinct probabilities, p should have shape (10, 3).

There is some general discussion about distribution dimensions and tips that you may find useful: Distribution Dimensionality — PyMC 5.10.0 documentation
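For illustration, here is a minimal sketch of that shape check, using a uniform p as a stand-in for the model's softmax output; it shows that a Categorical with p of shape (10, 3) draws integer labels of shape (10,), not one-hot rows.

import numpy as np
import pymc as pm

# 10 events, 3 categories: the category probabilities sit on the last axis
p = np.full((10, 3), 1.0 / 3)  # uniform probabilities, a stand-in for the softmax output

cat = pm.Categorical.dist(p=p)
print(pm.draw(cat).shape)  # (10,): one integer label per event, not a one-hot row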

Thanks a lot! I realized some mistakes there and have fixed them.

However, there is still another problem. From the error message it appears that the error is caused by a shape mismatch, but when I checked the shapes of 'pred' and 'y_data' just before the likelihood, they were both (150, 3).

What exactly is happening?

import pymc as pm

with pm.Model() as model:

    # data
    X_data = pm.Data('X_bias', X_train_bias, mutable=False)
    y_data = pm.Data('y', y_train_cat, mutable=False)

    # prior distribution of w
    w = pm.Normal('w', mu=0, sigma=10, shape=(X_train_bias.shape[1], y_train_cat.shape[1]))

    # linear predictor and class probabilities via softmax over classes
    y = pm.math.dot(X_data, w)
    pred = pm.math.softmax(y, axis=1)
    print('y_data', pm.draw(y_data).shape)
    print('pred', pm.draw(pred).shape)

    # likelihood
    likelihood = pm.Categorical('likelihood', p=pred, observed=y_data)
    print(pm.draw(likelihood))
    
    # run sampling
    # trace = pm.sample()

y_data (150, 3)
pred (150, 3)


ValueError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pytensor/compile/function/types.py in __call__(self, *args, **kwargs)
    969     outputs = (
--> 970         self.vm()
    971         if output_subset is None
20 frames
ValueError: shape mismatch: objects cannot be broadcast to a single shape. Mismatch is between arg 0 with shape (150,) and arg 1 with shape (150, 3).

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/numpy/lib/stride_tricks.py in _broadcast_shape(*args)
    420     # use the old-iterator because np.nditer does not handle size 0 arrays
    421     # consistently
--> 422     b = np.broadcast(*args[:32])
    423     # unfortunately, it cannot handle 32 or more arguments directly
    424     for pos in range(32, len(args), 31):

ValueError: shape mismatch: objects cannot be broadcast to a single shape. Mismatch is between arg 0 with shape (150,) and arg 1 with shape (150, 3).
Apply node that caused the error: categorical_rv{0, (1,), int64, True}(RandomGeneratorSharedVariable(<Generator(PCG64) at 0x787CC4302260>), [150 3], 4, Softmax{axis=1}.0)
Toposort index: 3
Inputs types: [RandomGeneratorType, TensorType(int64, shape=(2,)), TensorType(int64, shape=()), TensorType(float64, shape=(150, 3))]
Inputs shapes: ['No shapes', (2,), (), (150, 3)]
Inputs strides: ['No strides', (8,), (), (24, 8)]
Inputs values: [Generator(PCG64) at 0x787CC4302260, array([150, 3]), array(4), 'not shown']
Outputs clients: [['output'], ['output']]

HINT: Re-running with most PyTensor optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the PyTensor flag 'optimizer=fast_compile'. If that does not work, PyTensor optimizations can be disabled with 'optimizer=None'.
HINT: Use the PyTensor flag exception_verbosity=high for a debug print-out and storage map footprint of this Apply node.

y_data shouldn't have that last dimension. The Categorical returns a single integer label out of the 3 possible categories, unlike a Multinomial with n=1, which is like a one-hot encoded Categorical.
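For reference, a minimal sketch of that fix, assuming X_train_bias and y_train_cat are the arrays from the posts above (with y_train_cat one-hot encoded, shape (150, 3)): convert the one-hot targets to integer labels before passing them as observed.

import pymc as pm

# assumption: X_train_bias and y_train_cat are the arrays from the original post,
# with y_train_cat one-hot encoded, shape (150, 3)
y_train_int = y_train_cat.argmax(axis=1)  # one-hot (150, 3) -> integer labels (150,)

with pm.Model() as model:

    # data
    X_data = pm.Data('X_bias', X_train_bias, mutable=False)
    y_data = pm.Data('y', y_train_int, mutable=False)

    # prior distribution of w
    w = pm.Normal('w', mu=0, sigma=10, shape=(X_train_bias.shape[1], y_train_cat.shape[1]))

    # linear predictor and class probabilities via softmax over classes
    pred = pm.math.softmax(pm.math.dot(X_data, w), axis=1)

    # likelihood: the Categorical expects integer labels, so observed has shape (150,)
    likelihood = pm.Categorical('likelihood', p=pred, observed=y_data)

    # run sampling
    trace = pm.sample()

With integer labels, pred has shape (150, 3) and the observed labels have shape (150,), which is what the Categorical expects. Alternatively, going by the Multinomial remark above, you could keep y_train_cat one-hot and use pm.Multinomial with n=1 as the likelihood instead.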


As you said, there was a problem with y_data.
Thank you so much!