Issue with custom likelihood

Hi,

Seems like a common sticking point, but I haven’t been able to find the help I need from past issues.

I am trying to define a custom likelihood and am running into issues. The log-likelihood I would like to define is:

\ln\mathcal{L}=-\nu \sum_{i=1}^n\left\{\frac{I_i}{S_i}+\ln S_i+\left(\frac{2}{\nu}-1\right)\ln I_i \right\}

where \nu is a constant, I_i is the data and S_i is the model of the data.
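For concreteness, the model evaluated at the data points is a power law plus a constant,

S_i=\theta_1 x_i^{-\theta_2}+\theta_3

which is what the likelihood function below implements.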

I have defined the likelihood as follows:

def likelihood(theta1, theta2, theta3, nu, x, y):
    '''
    theta1-3 - parameters
    nu       - constant
    x, y     - paired data
    '''
    mod_ = theta1*tt.power(x, -theta2) + theta3  # power-law-plus-constant model
    return -nu*((y/mod_) + tt.log(mod_) + (2/nu - 1)*tt.log(y))

I have then instantiated a PyMC model:


with pm.Model() as model:
    
    #prior
    slope = pm.Normal('slope', 1, 1 )
    amp = pm.HalfNormal('amp', 1 )
    const = pm.Normal('const', 2e3, 100)

    #likelihood
    like = pm.DensityDist('like', likelihood, observed=dict(theta1=amp,theta2=slope,
                                                            theta3 = const,
                                                            nu=36,
                                                            x=x,
                                                            y=data_p))

and the MAP estimation function works fine. However, when I try to sample, it appears to draw samples but crashes at the end with this error message:

MissingInputError: Input 0 of the graph (indices start from 0), used to compute Elemwise{exp,no_inplace}(amp_log__), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.

Somewhere, the parameter amp_log__ is being created, and is even returned in the MAP estimate:

{'slope': array(1.67333194),
 'amp_log__': array(-0.96845745),
 'const': array(1280.07218614),
 'amp': array(0.37966824)}
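As far as I can tell, amp_log__ comes from PyMC3 automatically log-transforming positive-support variables such as HalfNormal, so it should just be the unconstrained version of amp rather than something my code created. A minimal sketch of what I mean (separate from the model above):

import pymc3 as pm

with pm.Model() as m:
    amp = pm.HalfNormal('amp', 1)

print(m.free_RVs)    # the sampler works on the transformed variable amp_log__
print(m.named_vars)  # contains both 'amp' and 'amp_log__'

That still doesn't explain the MissingInputError, though.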

Any ideas as to what I am doing wrong?

Also, do I need to specify the summation in the likelihood explicitly in my definition of the likelihood function?

It turns out a slight rewrite of the function, and of the subsequent call, did the trick:

def likelihood(theta1, theta2, theta3, nu):
    '''
    theta1-3 - parameters
    nu       - constant
    '''
    def logp_(value):
        # value[0] is x, value[1] is the observed data
        mod_ = theta1*tt.power(value[0], -theta2) + theta3
        return -nu*((value[1]/mod_) + tt.log(mod_) + (2/nu - 1)*tt.log(value[1]))
    return logp_


with pm.Model() as model:
    
    #priors
    ....

    #likelihood
    like = pm.DensityDist('like', likelihood(amp, slope, const, 2), observed=[x, test_data])

So the model samples, but now I get divergences! One step forward and all that…

If anybody can shed light on why the previous code did not work, it would be appreciated.


Maybe a more informative prior for const would help?

Hi,

Thanks for the suggestion. However, for the scale of the data, that prior is already reasonably informative (data values range from 1e2 to 1e6).

I found that the divergences actually came about because, while trying out different things to get the likelihood working, I had switched the prior on the amplitude to a Normal (as opposed to the HalfNormal). Ruling out non-physical values probably stopped the leapfrog steps from wandering into a region that led to divergences.

I am now encountering some odd behaviour when using the ArviZ HDI routines. Apologies for conflating two topics in this thread!

I am using the same model as above (although somewhat modified), but now with a deterministic quantity added. I will restate it so nothing is ambiguous.

def pow_law_mod(x,p1,p2,p3):
    return p1*tt.power(x,-p2)+p3

def likelihood(model, theta1, theta2, theta3, nu):
    '''
    model:    - function
                functional form of curve
    theta1-3: - float
                parameters
    nu:       - int 
                constant
    '''
    def logp_(value):
        # value[0] is x, value[1] is the observed data
        mod_ = model(value[0], theta1, theta2, theta3)
        return -nu*((value[1]/mod_) + tt.log(mod_) + (2/nu - 1)*tt.log(value[1]))
    return logp_


with pm.Model() as pow_law_v2:
    
    #prior
    slope = pm.Normal('slope', 1, 1 )
    amp = pm.HalfNormal('amp', 1 )
    const = pm.Normal('const', 1e3, 100)
    
    #likelihood
    like = pm.DensityDist('like', likelihood(pow_law_mod, amp, slope, const, 2), observed=[x, test_data])
    mu = pm.Deterministic('mu', pow_law_mod(x, amp, slope, const))

    trace = pm.sample(1000, tune=2500, target_accept=0.85)
    res = az.from_pymc3(trace)

The model samples fine, and posterior samples give good estimates for actual parameter values.
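For reference, the HDI data used below is computed with something along these lines (default 94% interval):

hdi_data = az.hdi(res, var_names=['mu'])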

I am then using the ArviZ hdi and plot_hdi functions to generate an HDI for mu. However, when plotting:

az.plot_hdi(x,hdi_data=hdi_data['mu'], ax=ax)

gives an odd-looking curve:

Basically, there is a strange dip around 0.002. The curve is the same when passing the trace of mu to plot_hdi instead of pre-computing the HDI data.

I checked whether it was something to do with the samples from the posterior by re-creating the posterior curves from the model parameters (i.e. amp, slope and const); these give the grey lines. The curves for mu taken directly from the trace give the same result.
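For concreteness, that grey-line check was roughly the following (x and ax as in the plot_hdi call above; 50 random posterior draws):

import numpy as np

post = res.posterior.stack(sample=('chain', 'draw'))
for i in np.random.randint(0, post.sample.size, 50):
    a = float(post['amp'][i])
    s = float(post['slope'][i])
    c = float(post['const'][i])
    # rebuild the curve from the sampled parameters, as pow_law_mod does
    ax.plot(x, a * np.power(x, -s) + c, color='grey', alpha=0.2)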

Finally (and I just checked this before hitting submit), if I plot the upper and lower HDI curves from az.hdi manually:

plt.plot(x, hdi_data['mu'][:, 0])
plt.plot(x, hdi_data['mu'][:, 1])

They are also fine.

Am I missing something with the plot_hdi function?