Hello all,
I am new to PyMC and to Bayesian statistics in general. I am trying to figure out how to fit a model that I have previously been fitting in two steps with a frequentist approach. I am working with media consumption data, and I am trying to project total consumption 8 months out using data from the first 35 days. The media consumption data seems to largely follow an exponential decay pattern. By log-transforming daily consumption and fitting a line, I can model the decay and extrapolate into the future. Looking at historical data, this extrapolation is almost always an underestimate, but it correlates really well with the total consumption at 8 months. So I have fit a second linear model under a log-log transform.
Frequentist Method:
Model 1: log(daily consumption) = m1 * day + b1 #done individually for each video
Model 2: log(8 months total consumption) = m2 * log(extrapolated total consumption) + b2
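To make the two steps concrete, here is a minimal sketch on made-up data. It uses only the Python standard library. The helper names (`fit_line`, `extrapolated_total`), the 240-day horizon, and the way the fake 8-month totals are generated are all assumptions for illustration, not my real pipeline or data.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = m * x + b; returns (m, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, y_bar - m * x_bar


def extrapolated_total(m1, b1, horizon_days):
    """Sum the fitted daily curve exp(m1 * day + b1) over day = 1..horizon_days."""
    return sum(math.exp(m1 * day + b1) for day in range(1, horizon_days + 1))


random.seed(0)
days = list(range(1, 36))  # the first 35 days of data
HORIZON = 240              # roughly 8 months (assumed value)

# Fake historical videos: (extrapolated total, "actual" 8-month total).
videos = []
for _ in range(20):
    m_true = -random.uniform(0.03, 0.10)
    b_true = random.uniform(5.0, 8.0)
    log_daily = [m_true * d + b_true + random.gauss(0, 0.1) for d in days]

    # Model 1: fitted separately for each video.
    m1, b1 = fit_line(days, log_daily)
    extrap = extrapolated_total(m1, b1, HORIZON)

    # Invented relationship so that the extrapolation underestimates the total.
    actual = 1.3 * extrap ** 1.02 * math.exp(random.gauss(0, 0.05))
    videos.append((extrap, actual))

# Model 2: log-log regression of actual totals on extrapolated totals.
m2, b2 = fit_line(
    [math.log(e) for e, _ in videos],
    [math.log(a) for _, a in videos],
)
print(f"m2 = {m2:.3f}, b2 = {b2:.3f}")
```

Note that Model 2 only ever sees one number per video (`extrap`), which is exactly where the uncertainty from Model 1 gets dropped.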
One of the motivations for doing this is to answer the question, “How likely is video A to outperform video B at the end of 8 months, given the data?” That is, in my opinion, squarely in the domain of Bayesian statistics. Because of this, I am brushing up on my Bayesian statistics and trying to convert this method into a Bayesian model.
Using Bambi, I have been able to fit the equivalent (or better) of each of these models. The issue I see is that I am losing a lot of information when I move from Model 1 to Model 2. I can see many different approaches to the transition, but they all involve collapsing the posterior distribution of Model 1 into a single value. What I would like to do instead is either fit one combined model or provide the transformed posteriors of Model 1 to inform Model 2.
I think fitting a combined model would be superior, even if it is more computationally expensive. The challenge I see is that there is a lot of gymnastics between Model 1 and Model 2. First, we are converting from a kind of time series problem to an area-under-a-curve problem. More importantly, we transform out of log space and back into it. Can NUTS handle something like that? How would I write a combined model? Is this where you would use pm.Deterministic?
The other option is to transform the posteriors of Model 1 and provide them as a prior to Model 2. I have seen
where Kernel Density Estimation is used to construct a prior distribution. I think what I am trying to do is different. I don’t have prior knowledge of most of the parameters in the second model, because I am not fitting the same model again. I only have a probability distribution around the independent variable. There is also the fact that each video has its own uncertainty. The daily consumption of some videos is better modeled by exponential decay than that of other videos. The effect is that we have more confidence in the long-term projection for some videos than for others. I am not sure how to provide that information to the second model.
I can work on creating a script to generate fake data similar to my data if that would be helpful to the discussion.
Thanks in advance for the help. This forum has already answered a few of my questions, and given me a lot of food for thought.