Compute the KL divergence between two distributions

By “VI” you mean “variational inference,” correct?

I’m not sure exactly what’s going wrong here, because I’m testing the logp of samples generated with sample_prior_predictive, so there isn’t a VI stage the way there is when I use sample with NUTS, is there?
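
Concretely, the check I’m running looks something like this toy sketch (recent PyMC-style API; not my actual model, and any transformed variables would need to be mapped to their transformed value variables first):

```python
import numpy as np
import pymc as pm

# Toy stand-in model: sample_prior_predictive is pure forward sampling,
# so there is no VI or NUTS stage involved. Variables are kept unbounded
# (Normal only) so the prior-draw names match the model's value variables.
with pm.Model() as model:
    mu = pm.Normal("mu", 0.0, 1.0)
    theta = pm.Normal("theta", mu, 1.0, shape=5)
    y = pm.Normal("y", theta, 1.0, observed=np.zeros(5))
    prior = pm.sample_prior_predictive(draws=100, random_seed=1)

# Evaluate the joint model logp at individual prior draws.
logp_fn = model.compile_logp()
for i in range(3):
    point = {
        "mu": prior.prior["mu"].values[0, i],
        "theta": prior.prior["theta"].values[0, i],
    }
    print(i, logp_fn(point))
```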

I don’t know how to check the internals of the model to explain this, except to say that it is a three-level hierarchical model, because it involves measurements of populations, and that the population measures (per my other question) are the sum of independent Gaussians. I’ll see about putting the model into a notebook to share.
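
In the meantime, here’s a rough sketch of the kind of structure I mean (all names, shapes, and data below are placeholders, not the real model):

```python
import numpy as np
import pymc as pm

# Illustrative three-level structure: global hyperpriors -> per-population
# measure (a sum of independent Gaussian components) -> individual observations.
n_pops, n_components, n_obs = 4, 3, 50
pop_idx = np.random.default_rng(0).integers(n_pops, size=n_obs)
obs = np.zeros(n_obs)  # placeholder data

with pm.Model() as model:
    # Level 1: global hyperpriors
    mu_global = pm.Normal("mu_global", 0.0, 1.0)
    sigma_pop = pm.HalfNormal("sigma_pop", 1.0)

    # Level 2: per-population measure = sum of independent Gaussians
    components = pm.Normal("components", mu_global, sigma_pop,
                           shape=(n_pops, n_components))
    pop_measure = pm.Deterministic("pop_measure", components.sum(axis=-1))

    # Level 3: individual observations within each population
    sigma_obs = pm.HalfNormal("sigma_obs", 1.0)
    y = pm.Normal("y", pop_measure[pop_idx], sigma_obs, observed=obs)
```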

Is there any chance that the logp simply decays as the number of variables in the model grows, and that beyond a certain number of variables we need to rescale?
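
What I mean by “decays” is the sum-of-terms effect: if the joint logp is a sum of per-variable terms, then for a typical draw its magnitude should grow roughly linearly with the number of variables, something like:

```python
import numpy as np
from scipy import stats

# For N independent standard-normal variables, the joint logp at a typical draw
# is roughly -N/2 * (1 + log(2*pi)): the total grows (more negative) linearly
# with N while the per-variable term stays roughly constant.
rng = np.random.default_rng(0)
for n in (10, 100, 1000, 10000):
    x = rng.standard_normal(n)
    total = stats.norm.logpdf(x).sum()
    print(f"n={n:6d}  total logp={total:12.1f}  per-variable={total / n:8.3f}")
```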

Is there any tracing mechanism that would let me take the logp calculation for a single point and print out all of the factors? That would show us if it’s decaying with size as I suggest.
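
Something along these lines is what I’m after; if I’m reading the current PyMC API right, Model.point_logps and compile_logp(sum=False) might already do it (toy model again, not mine):

```python
import numpy as np
import pymc as pm

# Toy stand-in model, just to show the mechanics of inspecting logp factors.
with pm.Model() as model:
    mu = pm.Normal("mu", 0.0, 1.0)
    theta = pm.Normal("theta", mu, 1.0, shape=3)
    y = pm.Normal("y", theta, 1.0, observed=np.zeros(3))

point = model.initial_point()  # or a dict built from one of my prior draws

# Per-variable logp contributions at that point (name -> summed logp term).
print(model.point_logps(point))

# Elementwise factors, if finer granularity is needed: sum=False returns one
# array of logp terms per model variable instead of a single scalar.
logp_fn = model.compile_logp(sum=False)
print(logp_fn(point))
```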

Thanks for any suggestions!