Help with setting up a hierarchical GLM

I’m trying to set up a Poisson GLM that is somewhat hierarchical. I read through quite a few blog posts but couldn’t really work out how to apply them to my case. Any help would be greatly appreciated!

A simplified version of my problem is the following:
I’m trying to fit an auto insurance portfolio with claim counts and exposure. I have several vehicle characteristics; for this limited example let’s focus on car model (Model), Vehicle Class, and Vehicle Size, plus an exposure column called simply exposure. Each car Model is associated with one vehicle class and one vehicle size.

I would like to fit the following model, which in essence says that the car model frequency mean is a function of both vehicle class and vehicle size (and possibly an interaction between them).

α_class ~ Normal(0, 0.3)
α_size ~ Normal(0, 0.3)
α_model ~ Normal(α_class + α_size, σ_model)
ln(μ) = α_model + offset(ln(exposure))

claims ~ Poisson(μ)

My problem starts with trying to correctly juggle all of the dimensions of the problem, especially defining the α_model prior.

One of the trials I made looks like this:

coords = {...}
size_codes = ...
class_codes = ...
with pm.Model(coords=coords) as m:
   α_class = pm.Normal("α_class", 0, 0.3, dims="class")
   α_size = pm.Normal("α_size", 0, 0.3, dims="size")
   # What I want to do, but which wouldn't work,
   # because α_model should have dims="model"
   # while α_class[class_codes] and α_size[size_codes]
   # have the number of observations as their shape.
   α_model = pm.Normal("α_model", α_class[class_codes] + α_size[size_codes], 0.3)
   # I also tried to come up with a mapping class -> model & size -> model
   class_to_model: dict = ...
   α_model = pm.Normal("α_model", α_class[class_to_model[class_codes]] + ...)
   # but again the dictionary translation will have the size of the number of observations.

   # The rest of the code would be something like:
   claims = pm.Poisson("claims", at.exp(α_model + at.log(df.exposure)),
                       observed=df.claims)

I have made other attempts as well, but for some reason I can’t wrap my head around how to implement this properly.
Any help would be greatly appreciated!
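To make the shape problem concrete, here is what the fancy indexing does in plain NumPy (made-up sizes, not my real data):

```python
import numpy as np

# Made-up dimensions: 3 vehicle classes, 5 car models, 1000 policies.
n_class, n_model, n_obs = 3, 5, 1000
rng = np.random.default_rng(0)

alpha_class = rng.normal(0, 0.3, size=n_class)      # one intercept per class
class_codes = rng.integers(0, n_class, size=n_obs)  # class index per *observation*

# Fancy indexing broadcasts to the shape of the index array, so the
# result has one entry per observation rather than one per model:
print(alpha_class[class_codes].shape)  # (1000,)
```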

Hi @omrihar.

My Yoda-like recommendation (to maximise learning) would be to try to implement the simplest model you can (e.g. no hierarchy, one predictor variable). Once you’ve got that working, it is much easier to make tiny jumps in complexity to achieve the model you want.

The size and class order are mixed up here:

coords = {

This line suggests you’d probably benefit from taking a look at how indexing is done in other models:

α_model = pm.Normal("α_model", α_class[class_to_model[class_codes]] + ...) 

The pseudocode could perhaps do with a bit more thinking about… everything in there apart from exposure and σ_model is a prior, and I suspect that the data needs to make more of an appearance.

α_class ~ Normal(0, 0.3)
α_size ~ Normal(0, 0.3)
α_model ~ Normal(α_class + α_size, σ_model)
ln(μ) = α_model + offset(ln(exposure))
claims ~ Poisson(μ)
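Spelled out with explicit indices (just a sketch; class_of, size_of, and model_of stand for the hypothetical model→class, model→size, and observation→model mappings), that spec would read:

α_class[c] ~ Normal(0, 0.3)        for each class c
α_size[s] ~ Normal(0, 0.3)         for each size s
α_model[m] ~ Normal(α_class[class_of(m)] + α_size[size_of(m)], σ_model)   for each model m
ln(μ[i]) = α_model[model_of(i)] + ln(exposure[i])   for each observation i
claims[i] ~ Poisson(μ[i])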

Hope this might be of some use?

Hi @drbenvincent,

thanks for replying! I did build much smaller models and am able to fit them very well. I’m now experimenting with this more complex one and I find it quite hard to explain exactly what I’m after, even though in my mind it is pretty clear.
In most of the examples I saw, the hyper-priors usually all come from the same grouping, in which case the indexing is straightforward. Here, however, because I want to include multiple hyperpriors with different dimensionality, I’m getting a semi brain-melt.

Now, however, I think I may have figured it out. Since there is a unique mapping model → size and model → class, I just need to build a smaller dataframe with one row per model; then, when I create the priors for model, I use this mapping instead of my large training set.
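For example (hypothetical column names; drop_duplicates here builds the same one-row-per-model table as the groupby I ended up using):

```python
import pandas as pd

# Stand-in for the training set: one row per policy.
df = pd.DataFrame({
    "model": ["m1", "m1", "m2", "m3", "m3"],
    "vehicle_class": ["sedan", "sedan", "suv", "suv", "suv"],
    "vehicle_size": ["small", "small", "large", "mid", "mid"],
})

# One row per car model, carrying that model's (unique) class and size:
m_to_c_s = (
    df[["model", "vehicle_class", "vehicle_size"]]
    .drop_duplicates()
    .set_index("model")
    .sort_index()
)
```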

I find that I have been struggling quite a lot to understand the vectorization process in PyMC, compared with Turing.jl or Stan, where building the likelihood with a simple for loop is much easier to program.

Having said that, I prefer to use PyMC and Python, since Python is my current native programming language and it is much easier to install on cloud compute instances such as Databricks than Stan, which is why I make a point of trying to build arbitrarily complex bespoke models in PyMC…

I’ll report back if/when I figure this out :slight_smile:

I highly recommend using pm.model_to_graphviz(model) to visualise the DAG, and perhaps physically drawing out how you want the model to be, then working on the code. It should hopefully ‘click’ at some point and then you’ll find it easy :slight_smile:

Thanks :slight_smile:

I do that a lot as well - and use prior predictive checks not only to test the prior relative to the observed, but also to make sure the dimensionality works out…

Turns out I was right, and the way to do it was something along the lines of:

m_to_c_s = (
    ...
    .groupby(['model', 'vehicle_class', 'vehicle_size'])
    ...
)


with pm.Model(coords=coords):
    α_class = pm.Normal("α_class", 0, 0.3, dims='class')
    α_size = pm.Normal("α_size", 0, 0.3, dims='size')
    α_model = pm.Normal("α_model",
        α_class[...] + α_size[...],  # indexed by the per-model codes from m_to_c_s
        0.3, dims="model")

This arranged everything correctly, and it now all works, since I can index into α_model with the per-observation model codes.

Thanks for the responses and rubber-ducking :wink: