Consider a two-level Bayesian hierarchical model. Level-1 has a parameter m1 and level-2 has parameters m1 and n1. I want to independently infer m1 in the first level, and then use the entire inferred distribution of this parameter in the second level to infer just n1.
I do not want to use just the posterior mean of m1 after level-1 inference, since that would treat m1 as a fixed, known value and would not propagate its uncertainty into the second level. So my question is: what would be the recommended approach to do this in pymc?
So far, from my limited understanding of pymc, this is what I can think of. For the second level of inference, assigning the full set of m1 samples to:
Shared variables, or
But both of these seem to be intended for different purposes. Any comments on the issue or a recommendation of a better approach would be very helpful. Thanks a lot!
As above, it also sounds to me like you are talking about some manner of hierarchical modelling. I find that the book “Doing Bayesian Data Analysis” by John Kruschke has some nice examples of this; chapter 9 in particular is quite insightful.
Just to be sure though: when you say you want to infer m1 independently on the first level, do you mean that level 1 has its own likelihood that depends on m1, that level 2 has its own likelihood plus priors that depend on m1, and that you want to sample both together? Or do you want to first fit the level-1 model on its own to get the posterior of m1, and then use that posterior in a second model that is sampled separately?
In the first case, you can always model these two levels together, like so:
m1 = prior(some fixed parameters)
m2 = prior(m1, some other fixed parameters)
Otherwise, if you mean the latter case, you can try to include the derived posterior of m1 in the next model as a normal distribution with the posterior mean and sd (provided the posterior looks normal enough). To me the first way makes more sense though: since the likelihood of D2 depends on m2, which depends on m1, D2 could be used to further inform what the likely values of m1 are. However, I guess there are instances where the latter approach could make sense too.
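Whether the posterior "looks normal enough" can be checked directly from the level-1 draws before summarising them. Below is a minimal sketch in plain Python, assuming the draws are available as a flat list. The name `m1_draws` and the simulated lognormal values are hypothetical stand-ins, not output from the model discussed here. It computes the mean and sd that a normal prior would use, plus the sample skewness as a rough symmetry check.

```python
import random
import statistics

random.seed(1)

# Hypothetical stand-in for the level-1 posterior draws of m1
# (simulated here, and deliberately skewed).
m1_draws = [random.lognormvariate(0.0, 0.5) for _ in range(5000)]


def normal_summary(draws):
    """Return mean, sd and sample skewness of a list of posterior draws."""
    mean = statistics.fmean(draws)
    sd = statistics.stdev(draws)
    # Skewness near 0 is consistent with a symmetric, roughly normal shape.
    skew = statistics.fmean([((d - mean) / sd) ** 3 for d in draws])
    return {"mean": mean, "sd": sd, "skew": skew}


summary = normal_summary(m1_draws)
print(summary)
```

If the skewness is close to zero, `mean` and `sd` could be passed to something like `pm.Normal("m1", mu=..., sigma=...)` in the second model. A clearly non-zero skewness, as in this simulated case, suggests the normal summary would distort the shape of the posterior.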
Wow, thanks a lot for your very insightful reply @iavicenna. I understand that the former statement you mentioned is more of a classical hierarchical model which is also present in most of the documentation available. But, what you mentioned as the latter statement is EXACTLY what I want to do:
Building on your suggestion, what bothers me is that the distribution of m1 after the first level of inference need not look like a normal distribution, so I would prefer not to assume that. Additionally, I feel that plugging the posterior mean and sd into an assumed normal distribution would be disguising an informed prior as the actual distribution of m1. Instead, I would like to somehow use the entire posterior of m1 as it is in the second level, which would better propagate the uncertainty of m1 into the second level (my data is quite chaotic!).
What I can think of at this stage are the following:
Using a pm.Deterministic() and passing on the sampled values of m1 (from level 1) and using it in the second level.
Using a pm.ConstantData() and passing on the sampled values of m1 and using it in the second level.
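To make concrete what "using the entire posterior of m1" means computationally, here is a minimal sketch in plain Python (not PyMC). It treats the level-1 draws as the prior on m1 in level 2 and averages the level-2 likelihood over them. Everything specific is an assumption for illustration: the draws in `m1_draws` are simulated, the level-2 model `y ~ Normal(m1 + n1, SIGMA)` is hypothetical, and n1 gets a flat prior on a grid.

```python
import math
import random
import statistics

random.seed(0)

# Hypothetical level-1 posterior draws of m1 (skewed on purpose, not normal).
m1_draws = [random.lognormvariate(0.0, 0.5) for _ in range(400)]

# Hypothetical level-2 data, assumed to follow y ~ Normal(m1 + n1, SIGMA).
SIGMA = 1.0
y = [random.gauss(3.0, SIGMA) for _ in range(50)]
y_bar, n_obs = statistics.fmean(y), len(y)


def log_lik(m1, n1):
    """Level-2 log-likelihood up to a constant (depends on y only via its mean)."""
    return -0.5 * n_obs * (y_bar - (m1 + n1)) ** 2 / SIGMA**2


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values) / len(values))


def grid_posterior(n1_grid, m1_values):
    """Normalised posterior weights for n1 on a grid (flat prior on n1),
    with the likelihood averaged over the supplied m1 values."""
    log_post = [log_mean_exp([log_lik(m1, n1) for m1 in m1_values])
                for n1 in n1_grid]
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


def grid_sd(grid, weights):
    """Standard deviation of a discrete distribution on a grid."""
    mean = sum(g * w for g, w in zip(grid, weights))
    return math.sqrt(sum(w * (g - mean) ** 2 for g, w in zip(grid, weights)))


n1_grid = [-2.0 + 0.02 * i for i in range(401)]  # grid from -2 to 6

# Full propagation: average over every level-1 draw of m1.
post_full = grid_posterior(n1_grid, m1_draws)
# Plug-in: fix m1 at its posterior mean and ignore its uncertainty.
post_plugin = grid_posterior(n1_grid, [statistics.fmean(m1_draws)])

sd_full = grid_sd(n1_grid, post_full)
sd_plugin = grid_sd(n1_grid, post_plugin)
print(f"sd of n1 using all m1 draws:   {sd_full:.3f}")
print(f"sd of n1 using mean of m1 only: {sd_plugin:.3f}")
```

In this toy setup the plug-in version gives a narrower posterior for n1, because the spread of m1 is discarded. Inside PyMC itself, one option that may be worth checking is building a prior from a density estimate of the draws with `pm.Interpolated`; the sketch above only illustrates the underlying computation and is not a PyMC recipe.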
It would be very helpful if you could let me know what you think. Once again, thanks a lot for your reply.
PS: In case you would like to take a better look at the model, I had posted an earlier question (link below), albeit with an older version of pymc and theano.