Hierarchical model with multiple levels

Hi there!

I am currently trying to build a model with several hierarchies but I am running into issues that I am not sure how to deal with.
Data:

        Familia  Articulo  pdiff   Uds
0       Cabello    154490   0.08  21.0
1       Cabello    171984   0.12  21.0
2       Cabello    123498   0.16  18.0
3       Cabello    126179   0.05  16.0
4       Cabello     58309   0.38  16.0
...         ...       ...    ...   ...
521288  Cabello    112320  -0.22   0.0
521289  Cabello    112304   0.00   0.0
521290  Cabello    190452   0.31   0.0
521291  Cabello     60284   0.30   0.0
521292  Cabello    192205   0.05   0.0

The data hierarchy is as follows:

  1. Family (Familia).
  2. Product (Articulo).
  3. Coefficient (pdiff).
  4. Sales at that coefficient (Uds).

The aim is to estimate demand from N Poisson distributions using samples from the posterior (lambda). This would be done for each product (art) and coefficient (pdiff), hence estimating sales at different coefficients.

A simple model only looking at the last 3 levels could be defined as follows:

import pymc as pm

with pm.Model(coords=coords) as basic_model:
    # hyperpriors for the Gamma prior on the Poisson rate
    alpha_art = pm.Uniform("alpha_art", lower=0, upper=50)
    beta_art = pm.Uniform("beta_art", lower=0, upper=50)
    # one rate per coefficient (pdiff)
    lbda = pm.Gamma('lbda', alpha=alpha_art, beta=beta_art, dims='pdiff')
    # each observation is indexed to its pdiff
    y = pm.Poisson('y', mu=lbda[idx], observed=data['Uds'])
    idata_hier = pm.sample()

Where coords is defined as follows:

import pandas as pd

cat_encode = pd.Categorical(data['pdiff'])
idx = cat_encode.codes  # integer index of each row's pdiff category
coords = {"pdiff": cat_encode.categories}

This model works fine and we can obtain a posterior for each coefficient (pdiff) for a specific product.
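
For instance, I can inspect the per-pdiff rates with ArviZ (the var_names argument just picks out lbda from the model above):

import arviz as az

# one row per pdiff category: posterior mean, sd, HDI, etc.
print(az.summary(idata_hier, var_names=["lbda"]))

# marginal posterior of each rate
az.plot_posterior(idata_hier, var_names=["lbda"])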

The problem arises when I try to add many more products from the same family. Assume that the priors alpha_a, …, beta_b are informative at a family level:

with pm.Model(coords=coords) as allprod_model:
    # family-level hyperpriors
    alpha_a = pm.Uniform("alpha_a", lower=0, upper=50)
    beta_a = pm.Uniform("beta_a", lower=0, upper=50)
    alpha_b = pm.Uniform("alpha_b", lower=0, upper=50)
    beta_b = pm.Uniform("beta_b", lower=0, upper=50)
    # per-article Gamma parameters
    alpha_art = pm.Gamma("alpha_art", alpha=alpha_a, beta=beta_a, dims='art')
    beta_art = pm.Gamma("beta_art", alpha=alpha_b, beta=beta_b, dims='art')
    lbda = pm.Gamma('lbda', alpha=alpha_art[art_idx], beta=beta_art[art_idx], dims=('art', 'pdiff'))
    y = pm.Poisson('y', mu=lbda[art_idx, pdiff_idx], observed=data['Uds'])
    idata_allarts_hier = pm.sample()

Having coords:

art_encode = pd.Categorical(data['Articulo'])
art_idx = art_encode.codes
coords = {"art": art_encode.categories}
cat_encode = pd.Categorical(data['pdiff'])
pdiff_idx = cat_encode.codes
coords['pdiff'] = cat_encode.categories

When sampling this model I get this error:

ValueError: could not broadcast input array from shape (1,521293) into shape (1137,280)
Apply node that caused the error: Alloc(Second.0, Cast{int64}.0, Cast{int64}.0)
Toposort index: 13
Inputs types: [TensorType(float64, shape=(1, None)), TensorType(int64, shape=()), TensorType(int64, shape=())]
Inputs shapes: [(1, 521293), (), ()]
Inputs strides: [(4170344, 8), (), ()]
Inputs values: ['not shown', array(1137, dtype=int64), array(280, dtype=int64)]
Inputs type_num: [12, 9, 9]
Outputs clients: [[Log(Alloc.0)]]

After some debugging I know where this is coming from. The model expects as many observations in the data as there are pdiff × art combinations; however, the data contains rows that share the same pdiff and art (they represent sales on different dates). One could argue that the solution is to group those rows, but part of the reason I am modelling this with a Poisson likelihood is that the data is provided in units sold per day (per art and pdiff).

Any ideas on how to proceed?

A possible solution that comes to mind is to first build a model aggregating data at the family level and use its posterior as the informative prior for the product-level model, then run one such model per product.
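
Something along these lines is what I have in mind (just a sketch: the alpha_fam/beta_fam names, the az.summary step, and reusing pdiff_idx/cat_encode from above are my own assumptions, and plugging posterior means in as fixed prior parameters throws away the family-level uncertainty):

import arviz as az
import pymc as pm

# Stage 1: family-level model, pooling all articles in the family
with pm.Model(coords={"pdiff": cat_encode.categories}) as family_model:
    alpha_fam = pm.Uniform("alpha_fam", lower=0, upper=50)
    beta_fam = pm.Uniform("beta_fam", lower=0, upper=50)
    lbda = pm.Gamma("lbda", alpha=alpha_fam, beta=beta_fam, dims="pdiff")
    pm.Poisson("y", mu=lbda[pdiff_idx], observed=data["Uds"])
    idata_fam = pm.sample()

# Stage 2: summarise the family-level posterior and reuse the point
# estimates as (fixed) prior parameters in each product-level model
summ = az.summary(idata_fam, var_names=["alpha_fam", "beta_fam"])
alpha_prior = summ.loc["alpha_fam", "mean"]
beta_prior = summ.loc["beta_fam", "mean"]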

I am following Osvaldo Martin’s book, but it doesn’t provide examples with many hierarchical levels, at least up to chapter 4…

Regards and thanks in advance,
Gabriel.

Welcome!

I think this is “just” a shape error. No need to restructure your model (and certainly no need to consider altering your data).

I think the line that is causing the issue is this:

lbda = pm.Gamma('lbda', alpha=alpha_art[art_idx], beta=beta_art[art_idx], dims=('art', 'pdiff'))

You are asking for a parameter array with dims ('art', 'pdiff') (2-dimensional), but specifying the parameters of the Gamma distribution to be of shape art_idx.shape (1-dimensional). So PyMC doesn’t know how you want the 1D vectors of alphas and betas to broadcast to the 2D parameter array. PyMC (and PyTensor under the hood) uses NumPy broadcasting, so you can use that as a reference. I assume you want something like this?

lbda = pm.Gamma(
    "lbda", alpha=alpha_art[art_idx, None], beta=beta_art[art_idx, None], dims=("art", "pdiff")
)
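
For reference, here is the same rule in plain NumPy (toy shapes rather than your real 1137 x 280):

import numpy as np

alpha_art = np.arange(4.0)           # one alpha per article, shape (4,)
col = alpha_art[:, None]             # add a trailing axis -> shape (4, 1)

# a (4, 1) column broadcasts against a pdiff axis of length 3
full = np.broadcast_to(col, (4, 3))
print(full.shape)                    # (4, 3)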

Hi there,

Thanks for the answer, that actually makes a lot of sense.

I will give it a go tomorrow morning with a data sample. I tried running it on the full dataset but it takes forever.

I will get back to you!

Thanks again,
Gabriel.


When dealing with shape errors, the best strategy (in my experience) is to work with a cut-down data set that has the same dims. You can iterate faster that way, then scale back up once you have a solution.
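
For instance, something like this (the 20 articles and the random_state are arbitrary; the column names are taken from your snippet above):

import pandas as pd

# keep only a handful of articles so the model is tiny but structurally identical
some_arts = data["Articulo"].drop_duplicates().sample(20, random_state=0)
small = data[data["Articulo"].isin(some_arts)].copy()

# rebuild the categorical encodings and coords from the subset
art_encode = pd.Categorical(small["Articulo"])
art_idx = art_encode.codes
cat_encode = pd.Categorical(small["pdiff"])
pdiff_idx = cat_encode.codes
coords = {"art": art_encode.categories, "pdiff": cat_encode.categories}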

Hi again!

I tried it out and it still threw an error regarding broadcasting lambda.

I gave it some thought and, correct me if I am wrong, we only need to index the data in the likelihood. I also realized that what I actually want is for all articles within a family (the dataset I am using) to share a prior over what I call pdiffs; just assume these are coefficients related to sales.

This way, if for a specific article I do not have information on a particular pdiff, it can use the average distribution given by the hyperprior. Is that assumption right?

with pm.Model(coords=coords) as allprod_model:
    # family-level hyperpriors
    alpha_a = pm.Uniform("alpha_a", lower=0, upper=50)
    beta_a = pm.Uniform("beta_a", lower=0, upper=50)
    alpha_b = pm.Uniform("alpha_b", lower=0, upper=50)
    beta_b = pm.Uniform("beta_b", lower=0, upper=50)
    # per-pdiff Gamma parameters, shared by all articles in the family
    alpha_pdiff = pm.Gamma("alpha_pdiff", alpha=alpha_a, beta=beta_a, dims='pdiff')
    beta_pdiff = pm.Gamma("beta_pdiff", alpha=alpha_b, beta=beta_b, dims='pdiff')
    # one rate per (article, pdiff) cell; the likelihood is indexed per row
    lbda = pm.Gamma('lbda', alpha=alpha_pdiff[None, :], beta=beta_pdiff[None, :], dims=('art', 'pdiff'))
    y = pm.Poisson('y', mu=lbda[art_idx, pdiff_idx], observed=data['Uds'])
    idata_allarts_hier = pm.sample(draws=4000, tune=4000, nuts_sampler='nutpie')
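
To get the demand estimates I mentioned at the start, I then draw Poisson samples from the posterior lambdas, e.g. (the article id 154490 and pdiff 0.08 are just the first row of the data preview; this is a sketch, not production code):

import numpy as np

# stack chains and draws, then pick one (article, pdiff) cell of lambda
lbda_post = idata_allarts_hier.posterior["lbda"].stack(sample=("chain", "draw"))
lam = lbda_post.sel(art=154490, pdiff=0.08).values

# simulate daily units sold at this coefficient from the Poisson likelihood
rng = np.random.default_rng(0)
simulated_units = rng.poisson(lam)
print(simulated_units.mean())   # posterior-predictive mean daily demand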

I compared the results with a model at the article level, and for some pdiffs of the same article one can see the shrinkage that should happen when using hierarchical models. Also, for each article I now have all pdiffs, which can prove useful to estimate sales at a particular pdiff.

I also changed the sampler since it was a bit too slow. It is still impractical, as it takes 90 minutes for 200 articles and I have approximately 30k, but at least it is an improvement. I assume that by choosing more informative priors based on domain knowledge, and maybe changing other things, it might improve further.

Any thoughts/corrections would be highly appreciated.

Kind regards,
Gabriel.