"Empirical Bayes" w/PyMC3

Not sure if the headline is the right way to describe my question, but I’m a little confused about how to approach this product mix project I’m working on. Basically, I’m trying to get a better estimate of this year’s product mix in light of the product mix we saw last year. So:

  • I’d like to use actual historical data to strongly inform my priors of a beta distribution, e.g. last year’s product mix by month
  • Then, use this year’s observed data to derive a posterior distribution I can sample from

I have used this empirical Bayes approach with plain Python and SciPy in the past, and I’d like to use PyMC3 instead.

Am I thinking of this correctly? Does PyMC3 even have a place here? Is this just a matter of using the prior year’s actual data to set the alpha/beta params of the Beta-distributed priors?
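
For reference, the plain-Python version of the approach described above could be sketched roughly as follows. This is only an illustration under assumptions: the counts are made-up numbers for a single period, and the helper names and the `strength` down-weighting factor are hypothetical, not taken from any library.

```python
import random


def beta_prior_from_counts(successes, trials, strength=1.0):
    """Turn last year's counts for one period into Beta(alpha, beta) parameters.

    `strength` (hypothetical knob) scales how many pseudo-observations the
    historical data is worth; 1.0 counts every historical unit in full.
    The leading 1.0 terms correspond to a flat Beta(1, 1) starting point.
    """
    alpha = 1.0 + strength * successes
    beta = 1.0 + strength * (trials - successes)
    return alpha, beta


def update(alpha, beta, successes, trials):
    """Conjugate Beta-Binomial update: add observed successes and failures."""
    return alpha + successes, beta + (trials - successes)


# Made-up numbers: units of one product out of all units sold in a period.
last_year = (300, 1000)
this_year = (45, 120)

# Prior from last year's data, down-weighted to 10% of its raw sample size.
a0, b0 = beta_prior_from_counts(*last_year, strength=0.1)

# Posterior after seeing this year's data for the same period.
a1, b1 = update(a0, b0, *this_year)
posterior_mean = a1 / (a1 + b1)

# Draw posterior samples of the product's share of the mix.
rng = random.Random(0)
draws = [rng.betavariate(a1, b1) for _ in range(5000)]
```

Because the Beta prior is conjugate to the Binomial likelihood, no sampler is needed for this simple case; the posterior is available in closed form and the draws come straight from it.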

That sounds fine to me, but why not model all data together (i.e., data from last year and this year) with a weakly informative prior?

The mix is very seasonal, i.e. it changes from week to week. So, I want to capture last year’s mix by week of year as the prior, then use this year’s observed mix to update the prior. Does that make sense?
I’m thinking of this a bit like modeling baseball batting averages from season to season but by game week, not just across the whole season.

So you will train two models? One using data from last year, and then use its posterior as the prior for a new model on this year’s data?
I still feel like you can model everything in one model.
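
One way to write down the single-model idea, as a sketch only: index the share parameter by week of year and let both years' counts inform it. The symbols here are assumptions introduced for illustration ($k_{w,t}$ units of the product out of $n_{w,t}$ total units in week $w$ of year $t$, and a weakly informative prior such as $a = b = 1$).

$$
\begin{aligned}
p_w &\sim \mathrm{Beta}(a, b), && w = 1, \dots, W \\
k_{w,t} &\sim \mathrm{Binomial}(n_{w,t},\, p_w), && t \in \{\text{last}, \text{this}\}
\end{aligned}
$$

Under these assumptions the posterior for each week is

$$
p_w \mid \text{data} \sim \mathrm{Beta}\Big(a + \sum_t k_{w,t},\; b + \sum_t \big(n_{w,t} - k_{w,t}\big)\Big),
$$

which is the same result as fitting last year first and using its posterior as this year's prior. The two routes only differ if the weekly share is allowed to change between years, for example by giving each year its own $p_{w,t}$ drawn around a shared weekly mean.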


I think you’re right, it’s not always initially intuitive to me that you’d combine those and use indexing efficiently (similar to answer re: my hierarchical model the other week). I’ll try out a few things here, thx!
