Difference between DensityDist and Potential?

kpei · August 31, 2017, 2:26pm

I’m currently using the Potential method to define my custom likelihood. However, I could also use DensityDist, and I would like to know the difference. This thread on SO explains (if I understood correctly) CDP’s use as a way to use mock observations generated by a normal rather than having to specify actual observed data. I also briefly recall fonnesbeck mentioning that pm.Potential does not get its own node within the graph. In PyMC2, it was used to constrain the likelihood.

My question is what is the difference between Potential and DensityDist? If there is a difference, how does it affect the sampling speed and results (if any)?

aseyboldt · August 31, 2017, 3:39pm

A potential allows you to add an arbitrary term to the logp function without adding new variables to the model. It is mostly a way to cheat your way around limitations of PyMC3, where the current distributions don’t allow you to do what you want easily. Examples of this are some forms of truncation or censoring that aren’t supported directly atm. If you can do what you want using a DensityDist, then that is usually the best way forward.
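To illustrate the "extra term in the logp" idea without depending on PyMC3 itself, here is a minimal plain-Python sketch. The names `model_logp` and `positive_only` are made up for this illustration and are not part of any library; the point is only that a potential is one more additive term in the joint log density.

```python
import math

def normal_logpdf(x, mu, sd):
    """Log density of Normal(mu, sd) evaluated at x."""
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((x - mu) / sd) ** 2

def model_logp(theta, potentials=()):
    """Joint log density: the prior term plus any extra potential terms."""
    logp = normal_logpdf(theta, mu=0.0, sd=1.0)  # prior on theta
    for potential in potentials:
        logp += potential(theta)  # each potential is just another additive term
    return logp

def positive_only(theta):
    """Hypothetical potential: rules out theta <= 0 (a crude truncation)."""
    return 0.0 if theta > 0 else -math.inf

base = model_logp(0.5)                                     # prior only
constrained = model_logp(0.5, potentials=[positive_only])  # same value, theta is allowed
excluded = model_logp(-0.5, potentials=[positive_only])    # -inf, theta is ruled out
```

No new variable is created here: the potential only changes the value the sampler sees for a given `theta`.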

junpenglao · August 31, 2017, 4:19pm

There is also more information in the /examples/factor_potential.py, please see below:

https://github.com/pymc-devs/pymc3/blob/f9917a2b021dc84215ca0ea9ac2f88e0e7f34a74/pymc3/examples/factor_potential.py#L3-L8

"""
You can add an arbitrary factor potential to the model likelihood using 
pm.Potential. For example you can added Jacobian Adjustment using pm.Potential
when you do model reparameterization. It's similar to `increment_log_prob` in 
STAN.
"""

In short, the official use case of pm.Potential is comparable to increment_log_prob in Stan (I think in the new version you can just do target +=?), whereas pm.DensityDist is for building your own distribution (it is also fine if the density is non-normalized, i.e. only known up to a constant).
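To make the Jacobian adjustment mentioned in the docstring concrete, here is a small worked case. It assumes a positive parameter θ that is sampled on the log scale as φ = log θ; the symbols θ, φ and g are illustrative and do not come from the example file.

```latex
\theta = g(\varphi) = e^{\varphi}
\quad\Longrightarrow\quad
\log p_{\varphi}(\varphi)
  = \log p_{\theta}\!\left(e^{\varphi}\right)
    + \log\left|\frac{d\theta}{d\varphi}\right|
  = \log p_{\theta}\!\left(e^{\varphi}\right) + \varphi
```

Under these assumptions, the extra term φ is the kind of quantity one would add to the model through pm.Potential.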

kpei · August 31, 2017, 6:25pm

Correct me if I’m wrong, but this sounds like a potential serves as an added term to the logp. Kind of like in software programming, where you have:

func(): doSomething();
subfunc(): super(func); dosomethingextra();

Where the potential is equal to the subfunc. If my likelihood in the model is defined as the Potential (with no other observed= fields anywhere else), then that is the entire likelihood, right? Sorry if that doesn’t make much sense; I only have a basic sense of how samplers work (mainly Metropolis). What I want to ask is: are these two pieces of code the same?

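import pymc3 as pm  # y is assumed to be an array of observed data

with pm.Model():
    # 'a' is an observed variable with a Normal likelihood
    a = pm.Normal('a', mu=0, sd=1, observed=y)

vs

with pm.Model():
    # the same Normal log-density of y, added directly as a logp term
    a = pm.Potential('a', pm.Normal.dist(mu=0, sd=1).logp(y))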

aseyboldt · September 1, 2017, 8:12am

They are mostly the same, and from the point of view of the sampler they are exactly the same.
They differ when you consider functions like sample_ppc, which ask for all observed variables (‘a’ in the first example, but none in the second).
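The distinction can be sketched in plain Python. `ToyModel` below is a hypothetical stand-in invented for this illustration, not PyMC3's actual model class: both variants contribute the same number to the log density, but only the observed variable leaves a record that a posterior predictive check could use.

```python
import math

def normal_logpdf(x, mu=0.0, sd=1.0):
    """Log density of Normal(mu, sd) evaluated at x."""
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((x - mu) / sd) ** 2

class ToyModel:
    """Hypothetical stand-in for a model: a bag of additive logp terms."""

    def __init__(self):
        self.terms = []           # every term the sampler sums over
        self.observed_names = []  # what a predictive check could look up

    def add_observed(self, name, data):
        # An observed variable adds its log-likelihood and registers itself.
        self.terms.append(sum(normal_logpdf(v) for v in data))
        self.observed_names.append(name)

    def add_potential(self, name, value):
        # A potential adds only a number; no data or distribution is recorded.
        self.terms.append(value)

    def logp(self):
        return sum(self.terms)

y = [0.3, -1.2, 0.8]  # made-up data for the illustration

with_observed = ToyModel()
with_observed.add_observed("a", y)

with_potential = ToyModel()
with_potential.add_potential("a", sum(normal_logpdf(v) for v in y))
```

Here `with_observed.logp()` and `with_potential.logp()` agree, while `observed_names` is `["a"]` for the first model and empty for the second.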

junpenglao · September 1, 2017, 8:55am

I think in pm.sample_ppc you can specify the var - it might work also for pm.Potential

[EDIT]: @aseyboldt is right, if you use pm.Potential you cannot do sample_ppc

aseyboldt · September 1, 2017, 9:02am

@junpenglao That shouldn’t work, Potentials don’t have a random method
