Problem with pymc3.DensityDist() #2846

from_github

#1

Hello everybody,

In the last few days I have been working with PyMC3. I am new to it and I am interested in building my own model for parameter estimation. Basically, my problem is that I want to use a different likelihood than the ones I have seen implemented in PyMC3. The likelihood I am interested in is given in “arXiv:astro-ph/0203259”, eq. (35) (see the image).

likelihood

Here k indexes the k-th dataset at our disposal, n_k is the number of data points in the k-th dataset, and $\chi^2_k$ is the chi-square for the k-th dataset.
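Taking the log of the expression above (dropping parameter-independent constants) gives, per dataset, $\ln\mathcal{L}_k = \ln\Gamma(n_k/2+1) + (n_k/2+1)\,[\ln 2 - \ln(\chi^2_k+2)]$. A minimal plain-Python sketch of evaluating this for a straight-line model (the toy data and the constant-variance chi-square are my own illustrative assumptions, not taken from the paper):

```python
import math

def ymodel(a, b, x):
    # straight-line model, as in the question
    return a + b * x

def chi2(a, b, xs, ys, sigma2):
    # standard chi-square: squared residuals weighted by the inverse variance
    return sum((ymodel(a, b, x) - y) ** 2 / sigma2 for x, y in zip(xs, ys))

def log_like(a, b, xs, ys, sigma2):
    # log of eq. (35): lnGamma(n/2+1) + (n/2+1) * [ln 2 - ln(chi2 + 2)]
    n = len(xs)
    N = n / 2.0 + 1.0
    return math.lgamma(N) + N * (math.log(2.0)
                                 - math.log(chi2(a, b, xs, ys, sigma2) + 2.0))

# toy data lying exactly on y = 1 + 2x, so chi2 = 0 at the true parameters
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
print(log_like(1.0, 2.0, xs, ys, 0.1))  # maximal: chi2 = 0 here
print(log_like(0.0, 0.0, xs, ys, 0.1))  # much lower: poor fit
```

At the true parameters the chi-square vanishes and the log-likelihood reduces to $\ln\Gamma(n/2+1)$, which is a handy sanity check.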

I think the way to add this kind of likelihood is via pm.DensityDist(). I have been trying to program it, but I have run into some problems. I am trying to fit a straight line to 3 datasets; first I did it with the usual approach, a Gaussian-like likelihood, but now I want to use the hyperparameter formalism. Could someone tell me what the problem in my code is? The code is basically:

    import theano.tensor as T
    import pymc3 as pm

    def ymodel(a, b, x):
        return a + b*x

    def chi2(a, b, x, data, sigma2, n):
        D = ymodel(a, b, x) - data
        sigma2_mat = T.nlinalg.AllocDiag()(sigma2*T.ones(n))
        return T.dot(D, T.dot(sigma2_mat, D))

    def log_Like1D(a, b, x1, y1, sigma2, n1):
        N1 = n1/2.0 + 1  # float division, so odd n1 is not truncated on Python 2
        return T.log(T.gamma(N1)) + N1*(T.log(2) - T.log(chi2(a, b, x1, y1, sigma2, n1) + 2))
    
    with pm.Model() as model1:
        # Priors for unknown model parameters
        a = pm.Uniform('a', lower=-5., upper=5.)
        b = pm.Uniform('b', lower=0., upper=4.)
        sigma2 = pm.Uniform('sigma', lower = 0.001, upper = 0.5)
        Lik = pm.DensityDist('Lik', lambda value: log_Like1D(a, b, x1, y1, sigma2, size) +
                                                  log_Like1D(a, b, x2, y2, sigma2, size) +
                                                  log_Like1D(a, b, x3, y3, sigma2, size))

But when I do:

    niter = 5000
    with model1:
        start = pm.find_MAP()  # compute a maximum a posteriori estimate
        step = pm.Metropolis()
        db = SQLite('H13Dataprue.db')
        trace = pm.sample(niter, trace=db, step=step, start=start,
                          njobs=5, temp=2, thin=50, random_seed=123)

I have the following error:

    ValueError                                Traceback (most recent call last)
    <ipython-input> in <module>()
          2 with model1:
          3     start = pm.find_MAP() # compute a maximum a posteriori estimate
    ----> 4     step = pm.Metropolis()
          5     db = SQLite('H13Dataprue.db')
          6     trace = pm.sample(niter, trace=db, step=step, start=start, njobs=5, temp=2, thin=50, random_seed=123)

    /home/enriques/Downloads/yes/lib/python2.7/site-packages/pymc3/step_methods/arraystep.pyc in __new__(cls, *args, **kwargs)
         63         # If we don't return the instance we have to manually
         64         # call __init__
    ---> 65         step.__init__([var], *args, **kwargs)
         66         # Hack for creating the class correctly when unpickling.
         67         step.__newargs = ([var], ) + args, kwargs

    /home/enriques/Downloads/yes/lib/python2.7/site-packages/pymc3/step_methods/metropolis.pyc in __init__(self, vars, S, proposal_dist, scaling, tune, tune_interval, model, mode, **kwargs)
        134
        135         shared = pm.make_shared_replacements(vars, model)
    --> 136         self.delta_logp = delta_logp(model.logpt, vars, shared)
        137         super(Metropolis, self).__init__(vars, shared)
        138

    /home/enriques/Downloads/yes/lib/python2.7/site-packages/pymc3/step_methods/metropolis.pyc in delta_logp(logp, vars, shared)
        606     inarray1 = tensor_type('inarray1')
        607
    --> 608     logp1 = pm.CallableTensor(logp0)(inarray1)
        609
        610     f = theano.function([inarray1, inarray0], logp1 - logp0)

    /home/enriques/Downloads/yes/lib/python2.7/site-packages/pymc3/theanof.pyc in __call__(self, input)
        270         input : TensorVariable
        271         """
    --> 272         oldinput, = inputvars(self.tensor)
        273         return theano.clone(self.tensor, {oldinput: input}, strict=False)
        274

    ValueError: need more than 0 values to unpack

I suppose this is because I never tell PyMC3 what my observables are. How can I tell PyMC3 which variables are my observables, and/or how can I fix this problem?

Thank you for your time,

LP


How to set up a custom likelihood function for two variables
#2

Yes, you need to provide observations to pm.DensityDist; in short, `observed` should be a Python dict containing all the inputs to your custom logp function (log_Like1D), which in your case would be dict(a=a, b=b...).
If you search here on Discourse you should find some similar posts on the usage of pm.DensityDist.

You can also try using pm.Potential, which adds a likelihood term to your model:

    Lik = pm.Potential('Lik', log_Like1D(a, b, x1, y1, sigma2, size) +
                              log_Like1D(a, b, x2, y2, sigma2, size) +
                              log_Like1D(a, b, x3, y3, sigma2, size))

It should work, but you might want to improve your likelihood implementation a bit (e.g., use a numerically stable log-gamma function instead of gamma, work with a log-chi2, etc.).


#3

I tried with pm.Potential and it works. Thank you very much!

So, is pm.Potential() only for when what I want to add is a likelihood? If I had complicated priors, should I use pm.DensityDist()?

Could you tell me which numerically stable functions you are referring to? Sorry, I am very new to numerical codes and I don’t know much about this.

Again, thank you very much for your help!

LP


#4

Using pm.DensityDist() is more flexible, as you can have a density function that is not associated with observed data. But in this case, where it is observed, the effect is the same as using Potential, i.e., adding a log-likelihood term to the model log-probability.

By a numerically stable function, I meant that you can replace T.log(T.gamma(N1)) with T.gammaln(N1).
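To see why this matters, here is a plain-Python illustration (math.lgamma plays the role of T.gammaln; the value n = 400 is just an illustrative assumption): the gamma function overflows a double for arguments above roughly 171.6, while log-gamma computes log(Gamma(x)) directly and stays finite.

```python
import math

n = 400               # e.g. a dataset of 400 points gives N1 = n/2 + 1 = 201
N1 = n / 2.0 + 1.0

# Direct route: Gamma(201) overflows a float, so log(gamma(...)) fails outright
try:
    unstable = math.log(math.gamma(N1))
except OverflowError:
    unstable = None   # could not even be computed

# Stable route: lgamma evaluates log(Gamma(x)) without ever forming Gamma(x)
stable = math.lgamma(N1)

print(unstable)       # None
print(stable)         # an ordinary finite number
```

The same reasoning applies inside the Theano graph: T.gammaln keeps the log-likelihood finite for large datasets where T.gamma would overflow.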