Adding a monthly effect with a GaussianRandomWalk without triggering shape matching error

Hi there! I am trying to model the price elasticity of different products, while controlling for seasonality.

Here is how one can generate the data:

import numpy as np
import pandas as pd

np.random.seed(42)

n_months = 12
n_floorplans = 3
n_obs = n_months * n_floorplans

data = {
    "month_id": np.repeat(range(1, n_months+1), n_floorplans),
    "product": np.tile(["A", "B", "C"], n_months),
    "demand_A": np.random.poisson(100, n_obs),
    "demand_B": np.random.poisson(80, n_obs),
    "demand_C": np.random.poisson(60, n_obs),
    "price_A": np.random.normal(1000, 50, n_obs),
    "price_B": np.random.normal(1200, 70, n_obs),
    "price_C": np.random.normal(1500, 100, n_obs),
}

df = pd.DataFrame(data)

The model computes alpha and beta to fit demand = exp(alpha + beta * (log(price) - log(avg_price))) for each product.

And the seasonality is modeled using a Gaussian random walk. So, it’s an array of 12 values. Here is the full model:

import pymc as pm
import pytensor.tensor as pt

month_idx, month_labels = df["month_id"].factorize(sort=True)
product_idx, product_labels = df["product"].factorize(sort=True)

coords = {
    "products": product_labels,
    "months": month_labels,
}
with pm.Model(coords=coords) as funnel:
    # define a monthly trend, as a random walk
    month_effect = pm.GaussianRandomWalk('month_effect', init_dist=pm.Exponential.dist(0.25), dims="months")
    
    # estimates the elasticity coefficients
    alpha = pm.Cauchy('alpha', 1, 2, dims="products")
    beta = pm.Cauchy('beta', -3, 2, dims="products")

    price_columns = [f"price_{p_name}" for p_name in product_labels]

    average_prices = df[price_columns].mean(axis=0)
    logμ0 = alpha + (beta * (np.log(df[price_columns]) - np.log(average_prices)))

    μ0 = np.exp(logμ0) + month_effect[month_idx]

    demand_columns = [f"demand_{p_name}" for p_name in product_labels]
    demand_obs = pm.Poisson('demand_obs', μ0, observed=df[demand_columns])

    idata = pm.sample()

When I attempt to sample without + month_effect[month_idx], it works well.
Now, after adding that seasonality component back, it throws:

ValueError: Input dimension mismatch. One other input has shape[1] = 3, but input[3].shape[1] = 36.

The way I understand it, it should work because:

  • μ0 = np.exp(logμ0) has as many items than the number of data points in the dataset, since it’s fed into a distribution with observed set to the dataset.
  • month_effect[month_idx] should have the same shape, since month_idx has as many items than df.shape[0]

What am I missing here?

(pymc==5.1.2)

Looks like it’s trying to broadcast the month effect across the product dimension, not the month dimension (hence the shape error – 3 produces vs 36 months). You can help it along by adding a dimension to month_effect, to make it (T, 1) rather than (T,). This removes ambiguity about how the addition should be broadcast:

μ0 = np.exp(logμ0) + month_effect[month_idx, None]

1 Like

Thank you so much, that solved the issue!

This looks like a cool model @nlassaux . Would you consider writing it up as an example (with a small bit of explanation and some plots) under the case studies or time series section here? PyMC Example Gallery — PyMC example gallery
If so, see the contributing guide here GitHub - pymc-devs/pymc-examples: Examples of PyMC models, including a library of Jupyter notebooks.

1 Like