Implementation of model's log-probability

I’m trying to dive into PyMC3 source code and I’d like to check my understanding of what is done by Model.logpt(). Let (x_1,\ldots,x_n) denote the model’s variables, divided into observable variables \mathbf y and free variables \mathbf z.

This is my guess at what logpt() does:

  1. its goal is to compute the log of the unnormalized posterior (since this is what we need for MAP and MCMC)
  2. the unnormalized posterior is proportional to the joint after clamping the observed variables to their values: p(\mathbf y|\mathbf z) \propto p(\mathbf z, \mathbf y), so to get the log-unnormalized-posterior it suffices to implement the computation of the joint
  3. the joint probability is computed using the chain rule relative to the DAG defined by the model: p(x_1,\ldots,x_n) = \prod_i p(x_i|\mathrm{pa}_{x_i}) where \mathrm{pa}_{x_i} can include both observed and free nodes
  4. logpt adds together the logs of the local CPDs and clamps the observed variables to their values

For example take a simple DAG z_1 \rightarrow y_1 \rightarrow z_2 and let y_1 = y_1^* denote the observations, then logpt() encodes the function (z_1,z_2) \mapsto \log p(z_1,z_2, y_1^*) =\log p(z_2|y_1^*) +\log p(y_1^*|z_1) + \log p(z_1)
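To make the chain-rule computation concrete, here is a minimal pure-Python sketch of that log-joint (this is illustrative only, not PyMC3 internals), assuming a hypothetical all-Gaussian version of the DAG above where each CPD is a unit-variance normal centered on its parent:

```python
import math

def norm_logpdf(x, mu=0.0, sigma=1.0):
    """Log-density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

def log_joint(z1, z2, y1_obs):
    """log p(z1, z2, y1*) for the DAG z1 -> y1 -> z2, with y1 clamped to y1_obs.

    The Gaussian CPDs are an assumption made for illustration; the point is
    that the log-joint is the sum of the local log-CPDs, with the observed
    variable fixed to its observation.
    """
    lp = norm_logpdf(z1)                      # log p(z1)
    lp += norm_logpdf(y1_obs, mu=z1)          # log p(y1* | z1)
    lp += norm_logpdf(z2, mu=y1_obs)          # log p(z2 | y1*)
    return lp
```

Viewed as a function of (z1, z2) alone, this is exactly the map (z_1,z_2) \mapsto \log p(z_1,z_2,y_1^*) above.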

Is this correct? Thanks a lot

(Side question: is FreeRV used as a synonym for “hidden variable”?).

Just noticed I’ve made a typo in point 2, I meant p(\mathbf z|\mathbf y) not p(\mathbf y| \mathbf z).

@lorenzgu thanks for asking this valuable question and sharing your reasoning. I don’t want to make any mistake, but your reasoning sounds about right to me. Have you had a look at the general API quickstart? General API quickstart — PyMC3 3.10.0 documentation

I think it will answer your current questions. Let us know if it does not :slight_smile:

@ricardoV94 thank you for answering. I’ve read the link, and maybe I can explain my doubts from a different point of view. I’m trying to learn about Bayesian inference partly by understanding how a library like PyMC3 works and how it is designed. I wanted to see what fundamental concepts such a library needs in order to do Bayesian inference, and sampling in particular: what kind of information and computations it needs to work.

My conclusion so far is that, limiting the discussion to sampling and MAP estimation, the essential elements for such a library are “observed” variables \mathbf y, “unobserved” variables \mathbf z, and a way to compute the joint distribution p(\mathbf z, \mathbf y).

Indeed, if the reasoning above is correct, then in order to sample from the “posterior” p(\mathbf z|\mathbf y^*) such a library only needs to know how to compute the joint: as a function of the free variables, the posterior is proportional to the joint, \mathbf z \mapsto p(\mathbf z, \mathbf y^*) \propto p(\mathbf z|\mathbf y^*). Moreover, once the probability densities are programmed, the joint distribution can be computed by the chain rule for DAGs. My question is basically whether that’s what Model.logpt() actually does.
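As a sanity check on the claim that a sampler needs only the unnormalized log-joint, here is a toy random-walk Metropolis sketch (stdlib only, not PyMC3 code). It assumes a hypothetical one-variable model with prior z ~ N(0, 1) and likelihood y|z ~ N(z, 1); after observing y* = 1, standard conjugacy gives the exact posterior N(0.5, 0.5), which the sampler should recover approximately:

```python
import math
import random

def log_joint(z, y_obs=1.0):
    # Unnormalized log p(z, y*): log-prior + log-likelihood.
    # Normalizing constants are dropped; Metropolis only uses differences.
    return -0.5 * z**2 - 0.5 * (y_obs - z)**2

def metropolis(log_target, n_samples=20000, step=1.0, seed=0):
    """Random-walk Metropolis using only an unnormalized log-density."""
    rng = random.Random(seed)
    z = 0.0
    lp = log_target(z)
    samples = []
    for _ in range(n_samples):
        prop = z + rng.gauss(0.0, step)
        lp_prop = log_target(prop)
        # Accept with probability min(1, exp(lp_prop - lp)).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            z, lp = prop, lp_prop
        samples.append(z)
    return samples
```

Note that nothing in `metropolis` ever normalizes the density; the sampler works because acceptance ratios depend only on differences of the log-joint, which is the same reason the unnormalized posterior suffices in point 2 above.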

Your understanding is correct. A brief outline of how PyMC3 collects the log_prob from each random variable can be found in the developer guide: PyMC3 Developer Guide — PyMC3 3.10.0 documentation

Thank you @junpenglao… it wasn’t obvious to me at first. Thanks for the reference as well :slight_smile: