Likelihood Specification and DensityDist

Please see the screenshot below for the question :slight_smile:

Hi,
You can find some of the details about DensityDist here: Probability Distributions in PyMC3 — PyMC3 3.11.2 documentation.

It requires the total log probability. This is not the log-likelihood; it is just the log probability evaluated at some value(s).

You can then use the custom distribution when specifying a likelihood. In this case, the values you pass are the observed values and the defined distribution takes on the role of the likelihood.

If you were to pass in any value of a random variable from that distribution, it would return the log probability of obtaining that value.
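For concreteness, here is a minimal sketch of that usage against the PyMC3 3.11 API. The Gaussian logp and the fabricated data are placeholders standing in for whatever log[f(x)] and observations you actually have:

```python
import numpy as np
import pymc3 as pm
import theano.tensor as tt

data = np.random.randn(100)  # placeholder observed values

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0.0, sigma=10.0)
    sigma = pm.HalfNormal("sigma", sigma=1.0)

    # The callable returns the total log probability of `value`
    # under the custom density -- here a Gaussian written by hand,
    # standing in for your custom kernel log[f(x)].
    def logp(value):
        return tt.sum(
            -0.5 * tt.log(2 * np.pi * sigma ** 2)
            - (value - mu) ** 2 / (2 * sigma ** 2)
        )

    # Passing observed= is what makes this term play the role
    # of the likelihood in the model's joint log probability.
    pm.DensityDist("obs", logp, observed=data)

    trace = pm.sample()
```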


Ok, thanks, Richard! I was using an approximation to the complex likelihood in my model, but this means I can just give pm.DensityDist() the custom kernel log[f(x)] that I’m working with.

Can you please clarify this paragraph:

“You can then use the custom distribution when specifying a likelihood. In this case, the values you pass are the observed values and the defined distribution takes on the role of the likelihood”

Also, given that I provide the callable function log[f(x)] to pm.DensityDist(), how exactly does PyMC3 build the log-likelihood in the sampler ?

What I meant was that the custom distribution is general, it is just a density function. There is nothing in PyMC that "converts" it to a likelihood function, as such.

If you had fixed values of model parameters \theta, in principle you could use your custom distribution to provide the probability of seeing random variable x (or potential observations), i.e. P(x|\theta).

It only becomes a likelihood function because you are providing observed data, D, and estimating the likelihood of seeing that data given some model parameters, P(D|\theta) (D is no longer a random variable, but fixed).
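To make that distinction concrete, here is a toy illustration with hypothetical numbers, using a plain Gaussian via SciPy rather than a PyMC3 model:

```python
import numpy as np
from scipy import stats

# Fixed parameters, varying x: a density over potential
# observations, i.e. P(x|theta).
mu, sigma = 0.0, 1.0
x = np.linspace(-3, 3, 7)
log_p_x = stats.norm.logpdf(x, mu, sigma)

# Fixed data D, varying parameters: the same function read
# as a likelihood of the parameters, i.e. P(D|theta).
D = np.array([0.2, -0.5, 1.1])
mus = np.linspace(-1, 1, 5)
log_lik = [stats.norm.logpdf(D, m, sigma).sum() for m in mus]
```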

To calculate the log posterior probability, the log likelihood is added to the log prior(s). The sampler picks out values of the parameters, passes them to the log likelihood (which also has the observed data passed to it) and to the log prior(s), and hence calculates the log posterior.

I would recommend reading something like Statistical Rethinking (McElreath) or Doing Bayesian Data Analysis (Kruschke), which provide good insight into how posteriors can be estimated. @RavinKumar also has a book coming out soon about Probabilistic Programming Languages.
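As a rough sketch of that bookkeeping (a hand-rolled grid evaluation, not what PyMC3 does internally; the data, prior, and grid are all made up):

```python
import numpy as np
from scipy import stats

D = np.array([0.2, -0.5, 1.1])     # hypothetical observed data
mu_grid = np.linspace(-2, 2, 201)  # candidate values of a parameter mu

log_prior = stats.norm.logpdf(mu_grid, 0.0, 10.0)  # log P(mu)
log_like = np.array(
    [stats.norm.logpdf(D, m, 1.0).sum() for m in mu_grid]
)                                                  # log P(D|mu)
log_post = log_prior + log_like  # log posterior, up to a normalizing constant
```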
