Why do we still need sampling in the Marginal GP implementation?

Hello, I am very new to PyMC, so I apologise for the very basic question, but this is a long-standing doubt I have about the Marginal implementation of Gaussian processes. As I understand it, we assume a Gaussian likelihood, so we can compute the marginal likelihood by performing the integral directly. Why do we still need sampling to get the posterior distribution? Couldn't we just use Bayes' rule, given that we have already calculated the integral in the denominator analytically?

I understand that in general we use sampling to approximate the posterior, not the marginal likelihood, but if we already have its ingredients, why do we still need to sample? Please point me in the right direction if I am missing something, and thanks very much!

In general, we tend to work with samples even when the posterior has a closed-form solution. It's just easier to ask about moments or arbitrary summaries of interest when you have samples/histograms.
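For instance, here is a minimal sketch (with made-up posterior parameters, not from any real model) of the kind of thing I mean: once you have samples, each summary is a one-liner, whereas each would otherwise need its own analytic derivation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend we derived this Gaussian posterior analytically
post_mean, post_sd = 1.2, 0.4
samples = rng.normal(post_mean, post_sd, size=10_000)

print(samples.mean(), samples.std())          # moments
print(np.quantile(samples, [0.025, 0.975]))   # 95% credible interval
print((np.exp(samples) > 4.0).mean())         # P(exp(mu) > 4), a nonlinear summary
```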

In practice, PyMC is built almost exclusively around MCMC sampling, so it will use that even when it isn't strictly needed. This may be wasteful in some cases, but it is the most general solution. Your model may have a marginal GP together with other likelihoods and priors/hyperpriors that also need sampling and which wouldn't necessarily have a known closed-form solution. MCMC handles all of these cases naturally, as in the sketch below.
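For example, here is a minimal sketch (with simulated data and hypothetical hyperpriors) of a typical `pm.gp.Marginal` model: `marginal_likelihood` integrates the latent GP values out analytically, but the lengthscale, signal variance, and noise hyperparameters have no closed-form posterior and still need to be sampled:

```python
import numpy as np
import pymc as pm

# Simulated data (hypothetical)
rng = np.random.default_rng(1)
X = np.linspace(0, 10, 50)[:, None]
y = np.sin(X).ravel() + 0.3 * rng.normal(size=50)

with pm.Model() as model:
    # Hyperpriors on the covariance function: no conjugacy here,
    # so their posterior has no closed form
    ell = pm.Gamma("ell", alpha=2, beta=1)
    eta = pm.HalfNormal("eta", sigma=1)
    cov = eta**2 * pm.gp.cov.ExpQuad(1, ls=ell)

    gp = pm.gp.Marginal(cov_func=cov)

    sigma = pm.HalfNormal("sigma", sigma=1)
    # The latent GP values are integrated out analytically here...
    y_obs = gp.marginal_likelihood("y_obs", X=X, y=y, sigma=sigma)

    # ...but the hyperparameters above still require MCMC
    idata = pm.sample()
```

(Depending on your PyMC version, the noise argument to `marginal_likelihood` may be called `noise` rather than `sigma`.)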
