Initial values not being used when sampling

Hello,

I have a model with 4 priors and a custom likelihood. Everything works fine when running the sample method with its default values and 1 chain. However, I need to change the initial values used by the algorithm, as they come from a research paper that I am trying to replicate in PyMC.

I have tried passing initval to each distribution directly, and also passing a dictionary to the sample method, but the values are not being used (the output is always the same no matter which initial values I pass, and I can see from the trace of each parameter that it doesn't start there).

Could you help me with this? Thank you so much,

My code:

from scipy.special import gammaln, betaln, loggamma, gamma
import numpy as np
import aesara.tensor as at
import pymc as pm
import arviz as az

# N11 and E11 are the data arrays (loaded elsewhere)
N11 = N11.astype(np.float64)
E11 = E11.astype(np.float64)

with pm.Model() as model2:

    # pointwise log-likelihood of a two-component mixture
    def likelihood(theta, E11, N11):
        Mix_p, alpha1, alpha2, beta1, beta2 = theta

        # log of the gamma-function ratio for each component
        const1 = np.log(at.gamma(alpha1 + N11)) - np.log(at.gamma(alpha1)) - np.log(at.gamma(N11 + 1))
        const2 = np.log(at.gamma(alpha2 + N11)) - np.log(at.gamma(alpha2)) - np.log(at.gamma(N11 + 1))
        # per-observation log-likelihood of each component
        LogL1 = const1 - N11 * np.log(1 + beta1 / E11) - alpha1 * np.log(1 + E11 / beta1)
        LogL2 = const2 - N11 * np.log(1 + beta2 / E11) - alpha2 * np.log(1 + E11 / beta2)

        # log of the weighted mixture of the two component likelihoods
        return np.log(Mix_p * np.exp(LogL1) + (1 - Mix_p) * np.exp(LogL2))

    # priors for the two components and the mixing weight
    alpha1 = pm.Gamma('alpha1', alpha=1, beta=1)
    beta1 = pm.Gamma('beta1', alpha=1, beta=1)
    
    alpha2 = pm.Gamma('alpha2', alpha=1, beta=1)
    beta2 = pm.Gamma('beta2', alpha=1, beta=1)
    
    Mix_p = pm.Uniform('Mix_p', 0, 1, initval=0.333)
    
    
    theta = (Mix_p, alpha1, alpha2, beta1, beta2)

    like = pm.Potential('likelihood', likelihood(theta, E11, N11))


initvals = {'Mix_p': 0.333, 'alpha1': 0.2, 'alpha2': 2, 'beta1': 0.1, 'beta2': 4}
with model2:
    trace = pm.sample(chains=1, initvals=initvals)
    pm.plot_posterior(trace)
    
az.plot_trace(trace)

You’re probably not looking at the beginning of the MCMC chains, because tuning iterations are discarded by default.
Try this:

with model2:
    idata = pm.sample(discard_tuned_samples=False)
az.plot_trace(idata.warmup_posterior)

Thank you for your answer!

I think I am struggling a little bit to understand how to properly initialize. If I use the following configuration:

    alpha1 = pm.Gamma('alpha1', alpha=1, beta=1, initval=0.2)
    beta1 = pm.Gamma('beta1', alpha=1, beta=1, initval=2.0)

    alpha2 = pm.Gamma('alpha2', alpha=1, beta=1, initval=0.1)
    beta2 = pm.Gamma('beta2', alpha=1, beta=1, initval=4.0)

    Mix_p = pm.Uniform('Mix_p', 0, 1, initval=0.333)

I get the following when looking at model2.initial_point():

{'alpha1_log__': array(-1.60943791),
 'beta1_log__': array(0.69314718),
 'alpha2_log__': array(-2.30258509),
 'beta2_log__': array(1.38629436),
 'Mix_p_interval__': array(-0.69464756)}

and the following error:

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'alpha1_log__': array(1.21993093), 'beta1_log__': array(1.43136018), 'alpha2_log__': array(1.23119309), 'beta2_log__': array(1.785199), 'Mix_p_interval__': array(-0.22687825)}

Initial evaluation results:
{'alpha1': -2.17, 'beta1': -2.75, 'alpha2': -2.19, 'beta2': -4.18, 'Mix_p': -1.4, 'likelihood': nan}

I am not sure why all the initial evaluation results come out negative, which then causes some issues…

I am also not very familiar with PyMC, so I may be doing something wrong.

The negative values there are probably just the log-priors of your free parameters.

But the likelihood function evaluated at this parameter vector results in a nan value.
Try sticking these numbers into the likelihood directly. You can do this with likelihood([0.333, 0.2, 2, 0.1, 4], E11, N11).eval(), which should give nan, and then do the same but .eval() the intermediate variables from your likelihood calculation until you find the one that's broken.
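
A sketch of that debugging loop, reusing the expressions from your likelihood (N11 and E11 are your data arrays):

# plug the candidate values in as plain floats, then evaluate each
# intermediate tensor on its own; the first inf/nan flags the broken term
Mix_p, alpha1, alpha2, beta1, beta2 = 0.333, 0.2, 2.0, 0.1, 4.0
const1 = np.log(at.gamma(alpha1 + N11)) - np.log(at.gamma(alpha1)) - np.log(at.gamma(N11 + 1))
print(const1.eval())
LogL1 = const1 - N11 * np.log(1 + beta1 / E11) - alpha1 * np.log(1 + E11 / beta1)
print(LogL1.eval())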

Hello,

So I tried to evaluate the likelihood at these values and it works fine:

likelihood([0.333, 0.2, 2, 0.1, 4],E11,N11).eval()
array([-4.85458219, -5.14259333, -5.75523372, ..., -6.2746346 ,
       -5.68955898, -6.87348596])

What I don't understand is that whether I use initval in each prior like this:

    alpha1 = pm.Gamma('alpha1', alpha=1, beta=1, initval=0.2)
    beta1 = pm.Gamma('beta1', alpha=1, beta=1, initval=2.0)
    alpha2 = pm.Gamma('alpha2', alpha=1, beta=1, initval=0.1)
    beta2 = pm.Gamma('beta2', alpha=1, beta=1, initval=4.0)
    Mix_p = pm.Uniform('Mix_p', 0, 1, initval=0.333)

or the start parameter of sample():

with model2:
    trace = pm.sample(chains=1, discard_tuned_samples=False,
                      start={'alpha1': 0.2, 'beta1': 2.0, 'alpha2': 0.1, 'beta2': 4.0, 'Mix_p': 0.333})

I get an error whose starting values look nothing like what I chose (even before getting to the log-priors). I read some posts about transformed and untransformed parameters in the search space, but I am not sure how to apply that here…

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'alpha1_log__': array(-1.57016103), 'beta1_log__': array(1.36793787), 'alpha2_log__': array(-3.18984774), 'beta2_log__': array(0.54012241), 'Mix_p_interval__': array(-1.29184623)}

Initial evaluation results:
{'alpha1': -1.78, 'beta1': -2.56, 'alpha2': -3.23, 'beta2': -1.18, 'Mix_p': -1.78, 'likelihood': nan}
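
For reference, mapping my initvals to the transformed space by hand (log for the Gamma priors, logit for the Uniform) gives values that differ from the reported starting values, though never by more than 1:

np.log(0.2)                  # -1.609  vs alpha1_log__     = -1.570
np.log(2.0)                  #  0.693  vs beta1_log__      =  1.368
np.log(0.1)                  # -2.303  vs alpha2_log__     = -3.190
np.log(4.0)                  #  1.386  vs beta2_log__      =  0.540
np.log(0.333 / (1 - 0.333))  # -0.695  vs Mix_p_interval__ = -1.292

So each starting value looks like my initval plus some offset on the transformed scale.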

I tested the model and likelihood with a different dataset and it worked fine, so it seems to be a matter of initializing properly…(?)

Thank you for your valuable help,
Victor

From the documentation for the initvals argument passed to pm.sample():

Initialization methods for NUTS (see init keyword) can overwrite the default.
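
The default there is init="jitter+adapt_diag", which adds a uniform jitter (within ±1, on the transformed scale) to each chain's starting point, and that matches the offsets you computed above. A sketch of a jitter-free call, which should start the chain exactly at the supplied values:

with model2:
    # "adapt_diag" keeps the mass-matrix adaptation but skips the jitter
    # that the default "jitter+adapt_diag" adds to the starting point
    trace = pm.sample(chains=1, init="adapt_diag",
                      initvals={'alpha1': 0.2, 'beta1': 2.0, 'alpha2': 0.1,
                                'beta2': 4.0, 'Mix_p': 0.333})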

I tried other initialization methods, but they always result in initial evaluation results composed of negative values, which causes the likelihood to crash…

Initial evaluation results:
{'alpha1': -1.33, 'beta1': -1.01, 'alpha2': -2.15, 'beta2': -6.45, 'Mix_p': -1.41, 'likelihood': nan}

I didn't find any way to end up with something like the following (whether passed via the priors' initval or anywhere else more suitable) that would compute a correct likelihood:
Initial evaluation results:
{'alpha1': 0.2, 'beta1': 2.0, 'alpha2': 0.1, 'beta2': 4.0, 'Mix_p': 0.333, 'likelihood': xxx}

The problem was indeed coming from the likelihood. The gamma function was being evaluated at values that were too large, returning inf. I switched to gammaln directly instead of taking the log of the gamma function, and sampling ran.
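
For reference, the working likelihood now looks roughly like this (at.gammaln computes the log-gamma directly, so large arguments no longer overflow to inf the way at.gamma did):

    def likelihood(theta, E11, N11):
        Mix_p, alpha1, alpha2, beta1, beta2 = theta
        # gammaln(x) == log(gamma(x)) without forming the huge gamma value
        const1 = at.gammaln(alpha1 + N11) - at.gammaln(alpha1) - at.gammaln(N11 + 1)
        const2 = at.gammaln(alpha2 + N11) - at.gammaln(alpha2) - at.gammaln(N11 + 1)
        LogL1 = const1 - N11 * at.log(1 + beta1 / E11) - alpha1 * at.log(1 + E11 / beta1)
        LogL2 = const2 - N11 * at.log(1 + beta2 / E11) - alpha2 * at.log(1 + E11 / beta2)
        return at.log(Mix_p * at.exp(LogL1) + (1 - Mix_p) * at.exp(LogL2))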

I am not sure the algorithm actually uses the initial values, as I can't see them in the trace (even after enabling the parameter mentioned above). But at least it runs.

Thank you for your help