I have tried to create a new random variable class, for a variable developed by Miller (1980) that provides the conjugate prior for a gamma RV with unknown shape and scale (see Fink’s Compedium of Conjugate Priors).
My class, MillerDist
, inherits from DensityDist
, adding a few new slots and a logp
method.
However, I am getting a bad result when I try to instantiate the variable using my __init__
method with the class MillerDist
. When I invoke the constructor (MillerDist.MillerDist(...)
), instead of getting a MillerDist
instance back, I get a FreeRV
instance back, which seems to be created by distributions.Dist.Var()
.
I may have missed instructions somewhere. Can someone either tell me what I’m doing wrong here, or point me at some documentation describing how to create a new random variable class?
I found this page on probability distributions and this github ticket, but neither of them explains how to make a new distribution class (indeed, AFAICT the latter replaces an attempt to do so with code that just directly instantiates a DensityDist
).
Thanks!
You wont get a distribution instance when you are initialising it in a model context. What PyMC3 does within the pm.Model()...
is to create a theano tensor with the name
as a FreeRV
, and add logp (as a constraint for the tensor FreeRV
) to the total model logp function.
So as long as you implemented the logp of your MillerDist correctly, your model will behave as you implemented (i.e., a variable following a prior distribution as specified by MillerDist). Dont worry that it does not return a distribution instance - it is not supposed to.
OK, that helps. Here are two follow up questions, if you would be so kind:
-
My logp
method uses slots on the distribution object I’m creating (for the MillerDist
, parameters/slots, P,Q,R
, and S
. But as far as I can tell, since those are put into the object by my __init__
method, the resulting FreeRV
instance cannot see them. This seems like almost a FAQ – when we add properties to a custom random variable, where do we put them and find them? My colleague suggests using, for example, var.distribution.P
– is that right?
-
My MillerDist
class has a plot
method defined on it. Should I be turning this into a static method and then just invoking it as MillerDist.plot(var)
instead of as var.plot()
? Or should I be using var.distribution.plot()
?
-
(sorry, one more --) I defined MillerDistribution.logp()
. Seems like I need to reference it as var.distribution.logp
, correct?
Maybe it would be helpful to add some description of the protocol for adding a new kind of random variable… I’m willing to help with editing this page, if that’s the right place.
Thanks, as always!
-
I only had a quick glance of the Miller distribution, so I am not completely sure what is this slots you are referring to. If you meant that P, Q, R, S
are themselves distributions, you can pass a distribution to it, eg: pm.Normal.dist(mu=0., sd=1.,)
. Similar usage could be found in LKJCholeskyCov: http://docs.pymc.io/api/distributions/multivariate.html#pymc3.distributions.multivariate.LKJCholeskyCov
-
If you implemented a plot function, probably the best way to access it within a model context is var.distribution.plot()
.
-
Yes. And if you are outside of a model context you can do MillerDistribution.dist(...).logp()
. eg: pm.Normal.dist(0., 1.).logp(0.)
It would be great to extend that page, contribution welcome!
1 Like