DensityDist from scipy.stats distribution

Hello all,

I am having a heck of a time figuring out how to use DensityDist. In this case I want to use a left-skewed Gumbel, which is implemented in scipy; scipy computes the logp for you, which seems like a good way to avoid having to get it right myself. Here is my most recent attempt. I have tried several combinations of Theano operations (which I seem to have a mental block against understanding), but I am not sure the issue lies there. The observed values are just a few representative numbers; I have a large data set.

import numpy as np
import pymc3 as pm
import scipy
import scipy.stats
import theano.tensor as tt
from theano.compile.ops import as_op
with pm.Model() as model_l:
    @as_op(itypes=[tt.dvector], otypes=[tt.dscalar])
    def logp_(value):
        return scipy.stats.gumbel_l(loc=loc, scale=scale).logpdf(value)

    loc = pm.Uniform('loc', -100, 100)
    scale = pm.Uniform('scale', 0, 100)
    gumbel_l = pm.DensityDist('gumbel_l', logp_,  
                              observed=[0, -5, -10, -20])
    trace1970 = pm.sample(1000,  njobs=4)

pm.traceplot(trace1970, combined=True)
pm.summary(trace1970)

Which gives:
ValueError: setting an array element with a sequence.

I suppose this is from scipy.stats.gumbel_l.logpdf() not liking vector input.
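Although, checking outside any model, scipy's logpdf does seem to handle vector values fine when the parameters are plain numbers, so maybe the symbolic loc and scale tensors reaching scipy are the real problem:

```python
import numpy as np
import scipy.stats

# With plain numeric loc/scale, a vector of values works fine.
values = np.array([0.0, -5.0, -10.0, -20.0])
logps = scipy.stats.gumbel_l(loc=0.0, scale=1.0).logpdf(values)
# logps[0] is 0 - exp(0) = -1.0 for the standard left-skewed Gumbel
```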

So I tried wrapping this in a list comprehension:

with pm.Model() as model_l:
    @as_op(itypes=[tt.dvector], otypes=[tt.dvector])
    def logp_(value):
        return np.asarray([scipy.stats.gumbel_l(loc=l, scale=s).logpdf(value) 
                           for l, s in zip(loc, scale)])
            

    loc = pm.Uniform('loc', -100, 100)
    scale = pm.Uniform('scale', 0, 100)
    gumbel_l = pm.DensityDist('gumbel_l', logp_,  
                              observed=[0, -5, -10, -20])
    trace1970 = pm.sample(1000,  njobs=4)

pm.traceplot(trace1970, combined=True)
pm.summary(trace1970)

Which gives:
TypeError: TensorType does not support iteration. Maybe you are using builtin.sum instead of theano.tensor.sum? (Maybe .max?)

This seems like something people would commonly want to do, since scipy.stats has a ton of distributions, but I could not find an example anywhere online.

Anyone have any tips/examples here?

Wrapping a scipy/numpy function (the logp in this case) is quite slow. The recommended approach is to rewrite it as a Theano function.
Looking at the scipy implementation, it seems not too difficult?
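For reference, the logpdf of the left-skewed Gumbel has a simple closed form that maps directly onto Theano ops: with z = (x - loc) / scale, logp(x) = z - exp(z) - log(scale). A quick numpy sketch, checked against scipy:

```python
import numpy as np
import scipy.stats

def gumbel_l_logpdf(x, loc, scale):
    # Closed form of the left-skewed Gumbel log-density;
    # -log(scale) is the Jacobian term of the standardization.
    z = (x - loc) / scale
    return z - np.exp(z) - np.log(scale)

x = np.array([0.0, -5.0, -10.0, -20.0])
ours = gumbel_l_logpdf(x, loc=-3.0, scale=2.0)
ref = scipy.stats.gumbel_l(loc=-3.0, scale=2.0).logpdf(x)
```

Swapping np for theano.tensor (tt.exp, tt.log) in gumbel_l_logpdf gives the Theano version.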

Thanks!
That is certainly an interesting possible way. I gave this a try and at least got the model to run, but I must be doing something wrong, because the fit is really poor while scipy was able to get a good fit.

Here is my pymc3 part:

def logp(value):
    return value - pm.math.exp(value)

with pm.Model() as model:
    scale = pm.HalfNormal('scale', sd=10)
    loc = pm.Uniform('loc', -100, 100)
    y = (dst.loc['2000'].values - loc) / scale
    gumbel_l = pm.DensityDist('gumbel_l', logp, testval=0, observed=y)
    trace = pm.sample(2000, )

Dealing with loc and scale this way feels a bit funny; is this done correctly? (See the docs.) I think I do have this wrong, but I get errors other ways; see the notebook below.

The entire working example is at:

You also need to account for the volume change in the transformation of variables: with y = (x - loc) / scale, the densities relate as p_x(x) = p_y(y) * |dy/dx| = p_y(y) / scale, so the log-density picks up a -log(scale) term. For more information see: https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant

Something like this should work:

def logp(value, loc, scale):
    value_ = (value - loc)/scale
    logp = value_ - pm.math.exp(value_)
    return logp - pm.math.log(scale)

with pm.Model() as model:
    scale = pm.HalfNormal('scale', sd=10, testval=10)
    loc = pm.Uniform('loc', -100, 100, testval=0)
    gumbel_l = pm.DensityDist('gumbel_l', 
                              logp, 
                              observed=dict(value=dst.loc['2000'].values, loc=loc, scale=scale))
    trace = pm.sample(2000)
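As a side note, one way to sanity-check the fit (a hypothetical side check, not part of the model above): generate synthetic left-skewed data and let scipy's MLE recover the parameters, which gives a ground truth to compare the posterior against. Note that numpy's gumbel sampler is the right-skewed version, so negate the draws:

```python
import numpy as np
import scipy.stats

np.random.seed(0)
# If X ~ Gumbel_r(loc=10, scale=3), then -X ~ Gumbel_l(loc=-10, scale=3).
data = -np.random.gumbel(loc=10.0, scale=3.0, size=20000)

# scipy's MLE should recover roughly loc=-10, scale=3.
loc_hat, scale_hat = scipy.stats.gumbel_l.fit(data)
```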

Perfect, thanks a lot for the help; this will get me going for most any distribution I might want to try. Let me ask a couple more documentation-related questions, then get off the stage.

  • I have seen observed=dict() in examples before, but not a good description of it in the docs. Did I miss it, or is this on the wish list?
  • What is the PyMC3 philosophy on building in more distributions? There are a ton in scipy. Is the project working toward a larger list as developers need them, or trying not to clutter?
  • Is there a diagnostic and a good description of what testval does, what it is for, and how to tell whether you chose a decent one?
  • Would there be interest in an example like mine (minus the big data file) in the repo for the next person? Is a pull request the best way?

Cheers

  • the dictionary fed to DensityDist contains all the inputs to the custom logp function, not just the observations (it is the observations plus the parameters). This is not explained in the docs and causes some confusion from time to time. I think it would be great to add it to the docs (http://docs.pymc.io/prob_dists.html#custom-distributions); want to send a pull request?
  • basically, if it is in scipy, we should have it in PyMC3, since all the tests are scipy-based, and if a distribution is in scipy that means there is enough interest and the computation is pretty stable.
  • usually you don't need to specify testval; it is more for debugging purposes. In previous releases there were cases where you needed to set a good starting value via testval to help the model converge to the typical set, but with the recent tuning improvements this should not be a big problem any more.
  • this one seems to be pretty basic usage, so it would be redundant with the docs we already have; if you can extend it into a case study, it would be a great contribution :wink: