Frequently Asked Questions

  • I am getting shape errors when doing posterior predictive sampling after calling set_data

By default, observed variables take their shape from the observed data. This is a convenience so that users don't have to specify the shape manually (via the shape argument or dims; see Distribution Dimensionality in the PyMC 5.5.0 documentation). It also means the shape stays fixed when other inputs are resized later, as in this model:

import pymc as pm

with pm.Model() as m:
  x = pm.MutableData("x", [0.0, 1.0, 2.0])
  data = pm.ConstantData("data", [0.0, 2.0, 4.0])

  beta = pm.Normal("beta")
  mu = beta * x
  y = pm.Normal("y", mu, observed=data)
  # This is equivalent to
  # y = pm.Normal("y", mu, observed=data, shape=data.shape)

  idata = pm.sample()

with m:
  pm.set_data({"x": [0.0, 1.0, 2.0, 3.0]})
  pm.sample_posterior_predictive(idata)

This fails with:

ValueError: shape mismatch: objects cannot be broadcast to a single shape.  Mismatch is between arg 0 with shape (3,) and arg 1 with shape (4,).

This happens because y has a fixed shape of (3,), taken from the observed data, while mu has shape (4,) after the call to set_data.
The same error can be reproduced directly in NumPy:

import numpy as np

np.random.normal([0.0, 1.0, 2.0, 3.0], size=(3,))
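A slightly more explicit version of the NumPy snippet above may help show what is going on. This is only a sketch: the helper name `draw` is made up for illustration, and it only mimics the broadcasting check; it is not how PyMC draws samples internally.

```python
import numpy as np


def draw(loc, size):
    """Hypothetical helper: return a normal draw, or the error text if it fails."""
    try:
        return np.random.normal(loc, size=size)
    except ValueError as err:
        # Raised when `loc` cannot be broadcast to the requested `size`.
        return str(err)


# Three means, three requested values: the shapes agree.
ok = draw([0.0, 1.0, 2.0], size=(3,))

# Four means, but still three requested values: the shapes conflict.
bad = draw([0.0, 1.0, 2.0, 3.0], size=(3,))

print(type(ok).__name__, ok.shape)
print(bad)
```

The first call returns an array of shape (3,), while the second returns the error message, because a fixed size of (3,) cannot accommodate four means.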

The recommended fix is to specify how the shape of y depends on the parameter we intend to change, here by passing shape=mu.shape:

import pymc as pm

with pm.Model() as m:
  x = pm.MutableData("x", [0.0, 1.0, 2.0])
  data = pm.ConstantData("data", [0.0, 2.0, 4.0])

  beta = pm.Normal("beta")
  mu = beta * x
  y = pm.Normal("y", mu, observed=data, shape=mu.shape)

  idata = pm.sample()

with m:
  pm.set_data({"x": [0.0, 1.0, 2.0, 3.0]})
  pm.sample_posterior_predictive(idata)
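In plain NumPy terms, the fix is roughly analogous to deriving the output size from the mean instead of hard-coding it. The sketch below is an analogy only: the function name `predict` and the fixed `beta=2.0` are assumptions for illustration, not part of the PyMC API.

```python
import numpy as np

rng = np.random.default_rng(0)


def predict(x, beta=2.0):
    """Hypothetical stand-in for one predictive draw of y given x."""
    mu = beta * np.asarray(x, dtype=float)
    # The output size follows mu, so it changes whenever x is resized.
    return rng.normal(mu, size=mu.shape)


y3 = predict([0.0, 1.0, 2.0])        # original x, three points
y4 = predict([0.0, 1.0, 2.0, 3.0])   # resized x, four points

print(y3.shape, y4.shape)
```

Because the size is computed from mu on every call, resizing x no longer conflicts with a previously fixed shape.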

More examples are available in the docs: see the pymc.set_data page of the PyMC 5.5.0 documentation.
