I have a dataframe df with columns country_id, state_id, city_id, date, and number of road accidents.
I want to build a Bayesian hierarchical model with the hierarchy country -> state -> city, to get estimates at the city level.
As I understand it, the posterior of an upper level can serve as the prior for the level below it. Since we are ignoring date, there will be multiple entries even at the city level.
How do I design this in PyMC?
I would strongly recommend reviewing the primer on multi-level modeling or the notebook reproducing chapter 9 from Kruschke’s textbook.
Thanks, this helped me understand the overall flow, but @jessegrabowski pointed to a discussion that covers an important step in defining the hierarchy in PyMC.
I actually defined the hierarchy,
The issue I encounter is, one state has 61 entries and another state has 8 entries.
This doesn’t seem correct. I have 2 states: state A has 110 data points, with a data mean of 0.036, and state B has 8 data points, whose mean comes out to be 0. What could have gone wrong? Any idea?
From those lumpy posteriors, it looks like the sampling wasn’t successful. Did you get divergences? What are the rhat statistics for these variables?
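To make the rhat question concrete, here is a small standard-library sketch of the classic Gelman-Rubin statistic on made-up chains. It is only an illustration of what the number measures (between-chain versus within-chain variance); ArviZ reports a rank-normalized split version, so its values will differ somewhat. The function name gelman_rubin and the simulated chains are hypothetical.

```python
import random
import statistics

def gelman_rubin(chains):
    """Classic (non-split) R-hat for a list of equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain variance.
    within = statistics.fmean(statistics.variance(c) for c in chains)
    # B / n: variance of the chain means.
    between_over_n = statistics.variance(chain_means)
    var_hat = (n - 1) / n * within + between_over_n
    return (var_hat / within) ** 0.5

rng = random.Random(0)

# Four chains exploring the same distribution: R-hat should be close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]

# Two chains stuck around 0 and two around 3: R-hat should be well above 1.
stuck = [[rng.gauss(m, 1.0) for _ in range(500)] for m in (0.0, 0.0, 3.0, 3.0)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
print(f"mixed chains: {rhat_mixed:.3f}, stuck chains: {rhat_stuck:.3f}")
```

Values noticeably above 1 mean the chains disagree with each other, which is one way lumpy, multi-modal-looking posteriors can arise.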
You are not going to have much luck learning about hierarchical parameters with only 2 groups, one of which has only 8 observations. mu_country is going to be learned from just 2 datapoints.
I suggest you run some simulation studies to see under what conditions you can and can’t recover true parameter values from this model, and think about whether you need all this hierarchy.
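As a very stripped-down version of such a simulation study, the sketch below skips MCMC entirely and only asks how well the between-state spread can be estimated from the state means, in the best case where each state's mean is known exactly. The function name, the true value of 1.0, and the number of replicates are all hypothetical choices; a real study would simulate data from the full model and refit it in PyMC.

```python
import random
import statistics

def spread_of_sigma_estimate(n_groups, true_sigma=1.0, n_reps=2000, seed=0):
    """Simulate group means many times and return how much the
    estimated between-group sd varies from one replicate to the next."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        # True group means drawn around a shared parent mean of 0.
        group_means = [rng.gauss(0.0, true_sigma) for _ in range(n_groups)]
        # Naive estimate of the between-group sd from those means.
        estimates.append(statistics.stdev(group_means))
    return statistics.stdev(estimates)

spread_2 = spread_of_sigma_estimate(2)
spread_20 = spread_of_sigma_estimate(20)
print(f"sd of estimate with 2 groups:  {spread_2:.2f}")
print(f"sd of estimate with 20 groups: {spread_20:.2f}")
```

With only 2 groups the estimate of the group-level spread swings widely between replicates, which is why the prior ends up doing most of the work for the hierarchical parameters.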
Thanks a lot. @jessegrabowski