Issue with custom likelihood

Hi,

Seems like a common sticking point, but I haven’t been able to find the help I need from past issues.

I am trying to define a custom likelihood and am running into issues. The log-likelihood I would like to define is:

\ln\mathcal{L}=-\nu \sum_{i=1}^n\left\{\frac{I_i}{S_i}+\ln S_i+\left(\frac{2}{\nu}-1\right)\ln I_i \right\}

where \nu is a constant, I_i is the data and S_i is the model of the data.
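For concreteness, the model evaluated at the data points is a power law plus a constant,

S_i=\theta_1 x_i^{-\theta_2}+\theta_3

which is what the likelihood function below implements.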

I have defined the likelihood as follows:

def likelihood(theta1, theta2, theta3, nu, x, y):
    '''
    theta1-3 - parameters
    nu       - constant
    x, y     - paired data
    '''
    mod_ = theta1*tt.power(x, -theta2) + theta3  # power-law-plus-constant model
    return -nu*((y/mod_) + tt.log(mod_) + (2/nu - 1)*tt.log(y))

I have then instantiated a PyMC model:


with pm.Model() as model:
    
    #prior
    slope = pm.Normal('slope', 1, 1 )
    amp = pm.HalfNormal('amp', 1 )
    const = pm.Normal('const', 2e3, 100)

    #likelihood
    like = pm.DensityDist('like', likelihood, observed=dict(theta1=amp,theta2=slope,
                                                            theta3 = const,
                                                            nu=36,
                                                            x=x,
                                                            y=data_p))

and the MAP estimation function works fine. However, when I try to sample, it appears to draw samples but crashes at the end with this error message:

MissingInputError: Input 0 of the graph (indices start from 0), used to compute Elemwise{exp,no_inplace}(amp_log__), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.

Somewhere, the parameter amp_log__ is being created, and is even returned in the MAP estimate:

{'slope': array(1.67333194),
 'amp_log__': array(-0.96845745),
 'const': array(1280.07218614),
 'amp': array(0.37966824)}
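As far as I can tell, amp_log__ comes from PyMC3 automatically log-transforming positive-support variables such as HalfNormal, so it should just be the unconstrained version of amp rather than something my code created. A minimal sketch of what I mean (separate from the model above):

import pymc3 as pm

with pm.Model() as m:
    amp = pm.HalfNormal('amp', 1)

print(m.free_RVs)    # the sampler works on the transformed variable amp_log__
print(m.named_vars)  # contains both 'amp' and 'amp_log__'

That still doesn't explain the MissingInputError, though.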

Any ideas as to what I am doing wrong?

Also, do I need to specify the summation in the likelihood explicitly in my definition of the likelihood function?

It turns out a slight rewrite of the function, and of the subsequent call, did the trick:

def likelihood(theta1, theta2, theta3, nu):
    '''
    theta1-3 - parameters
    nu       - constant
    '''
    def logp_(value):
        # value[0] is x, value[1] is the observed data
        mod_ = theta1*tt.power(value[0], -theta2) + theta3
        return -nu*((value[1]/mod_) + tt.log(mod_) + (2/nu - 1)*tt.log(value[1]))
    return logp_


with pm.Model() as model:
    
    #priors
    ....

    #likelihood
    like = pm.DensityDist('like', likelihood(amp, slope, const, 2), observed=[x, test_data])

So the model samples, but now I get divergences! One step forward and all that…

If anybody can shed light on why the previous code did not work, it would be appreciated.


Maybe a more informative prior for const would help?

Hi,

Thanks for the suggestion. However, for the scale of the data, that prior is already reasonably informative (data values range from 1e2 to 1e6).

I found that the divergences actually came about because, while trying out different things to get the likelihood working, I had switched the prior on the amplitude to a Normal (as opposed to the HalfNormal). Ruling out non-physical values probably stopped the leapfrog steps from wandering into a region that led to divergences.

I am now encountering some odd behaviour when using the ArviZ HDI routines. Apologies for conflating two topics in this thread!

I am using the same model as above (although somewhat modified), but now with a deterministic quantity added. I will restate it so nothing is ambiguous.

def pow_law_mod(x,p1,p2,p3):
    return p1*tt.power(x,-p2)+p3

def likelihood(model, theta1, theta2, theta3, nu):
    '''
    model:    - function
                functional form of curve
    theta1-3: - float
                parameters
    nu:       - int 
                constant
    '''
    def logp_(value):
        # value[0] is x, value[1] is the observed data
        mod_ = model(value[0], theta1, theta2, theta3)
        return -nu*((value[1]/mod_) + tt.log(mod_) + (2/nu - 1)*tt.log(value[1]))
    return logp_


with pm.Model() as pow_law_v2:
    
    #prior
    slope = pm.Normal('slope', 1, 1 )
    amp = pm.HalfNormal('amp', 1 )
    const = pm.Normal('const', 1e3, 100)
    
    #likelihood
    like = pm.DensityDist('like', likelihood(pow_law_mod, amp, slope, const, 2), observed=[x, test_data])
    mu = pm.Deterministic('mu', pow_law_mod(x, amp, slope, const))

    trace = pm.sample(1000, tune=2500, target_accept=0.85)
    res = az.from_pymc3(trace)

The model samples fine, and posterior samples give good estimates for actual parameter values.
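For reference, the HDI data used below is computed with something along these lines (default 94% interval):

hdi_data = az.hdi(res, var_names=['mu'])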

I am then using the ArviZ hdi and plot_hdi functions to generate an HDI for mu. However, when plotting:

az.plot_hdi(x,hdi_data=hdi_data['mu'], ax=ax)

gives an odd-looking curve:

Basically, there is a strange dip around 0.002. The curve is the same when passing the trace of mu to plot_hdi instead of pre-computing the HDI data.

I checked whether it was something to do with the samples from the posterior by re-creating the posterior curves from the model parameters (i.e. amp, slope and const); these give the grey lines. The curves for mu taken directly from the trace give the same result.
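For concreteness, that grey-line check was roughly the following (x and ax as in the plot_hdi call above; 50 random posterior draws):

import numpy as np

post = res.posterior.stack(sample=('chain', 'draw'))
for i in np.random.randint(0, post.sample.size, 50):
    a = float(post['amp'][i])
    s = float(post['slope'][i])
    c = float(post['const'][i])
    # rebuild the curve from the sampled parameters, as pow_law_mod does
    ax.plot(x, a * np.power(x, -s) + c, color='grey', alpha=0.2)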

Finally (and I just checked this before hitting submit), if I plot the upper and lower HDI curves from az.hdi manually:

plt.plot(x, hdi_data['mu'][:, 0])
plt.plot(x, hdi_data['mu'][:, 1])

They are also fine.

Am I missing something with the plot_hdi function?