Blackbox likelihood: Creating an Inference Data object and “Initial evaluation of model at starting point failed”

I'm fitting a reinforcement learning model to human behavior and would like to estimate marginal likelihoods and then Bayes factors for competing models. I'm able to compute the posterior distributions of the model parameters using:

with pm.Model() as mod:
	alpha1 = pm.Beta('alpha1', 1.1, 1.1)
	alpha2 = pm.Beta('alpha2', 1.1, 1.1)
	Lambda = pm.Beta('Lambda', 1.1, 1.1)
	w = pm.Beta('w', 1.1, 1.1)
	beta1 = pm.Gamma('beta1', 1.2, 1/5)
	beta2 = pm.Gamma('beta2', 1.2, 1/5)
	params = [alpha1, alpha2, Lambda, w, beta1, beta2]
	like = pm.Potential('like', likelihood(params, stateActions1, stateActions2, rewards)) 
	tr = pm.sample(return_inferencedata=True)

But I get this error when trying to use arviz:

az.loo(tr, mod)

*** TypeError: log likelihood not found in inference data object

It has been noted elsewhere (see https://discourse.pymc.io/t/log-likelihood-not-found-in-inferencedata/8169) that the log likelihood cannot be stored in the arviz InferenceData object when using a blackbox likelihood function with pm.Potential, because the conversion requires an "observed=" argument. So I replaced the like = pm.Potential… line of code with this:

prob_action = likelihood(params, stateActions1, stateActions2, rewards)
nr_actions = prob_action.shape[0]
stateActions1_binary = np.zeros((len(stateActions1),max(stateActions1)+1))
stateActions2_binary = np.zeros((len(stateActions2),max(stateActions2)+1))
stateActions1_binary[np.arange(len(stateActions1)),stateActions1] = 1
stateActions2_binary[np.arange(len(stateActions2)),stateActions2] = 1
like = pm.Multinomial('like', n=nr_actions, p=prob_action, observed=np.concatenate([stateActions1_binary[1:,:], stateActions2_binary[1:,:]]))

However, this leads to the following error:

pymc3.exceptions.SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'alpha1_logodds__': array(0.), 'alpha2_logodds__': array(0.), 'Lambda_logodds__': array(0.), 'beta1_log__': array(1.79175947), 'beta2_log__': array(1.79175947)}

Initial evaluation results:
alpha1_logodds__ -1.33
alpha2_logodds__ -1.33
Lambda_logodds__ -1.33
beta1_log__ -0.90
beta2_log__ -0.90
like -inf

Modifying the start values with the sample argument start={'alpha1_logodds__': np.array(0.5), …} does not improve the result, and I don't know why my likelihood function would be producing -inf values.

The second model doesn't seem equivalent to the initial one. Given that the first one works correctly, likelihood must be returning the log likelihood, which can't be used as a probability as is, so using it as p in the Multinomial ends up with nans or infs. You could exponentiate it to make sure the probabilities are in the 0-1 range (assuming the normalization factor is present), but even then the model is different from the original one.
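To be explicit about what I mean by exponentiating, a sketch only (and, as said, even with this the model would still not match the original Potential-based one):

import theano.tensor as tt

# sketch: exponentiate the blackbox output before using it as p
log_prob_action = likelihood(params, stateActions1, stateActions2, rewards)
prob_action = tt.exp(log_prob_action)  # back on the 0-1 probability scale
# ...then p=prob_action in the Multinomial, but the model is still a different model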

To get the log likelihood, you can use a DensityDist instead of a Potential (making sure all the inputs of observed are actual observed data, see Using a random variable as observed - #5 by OriolAbril; I think you'll need a lambda on likelihood so the function passed to DensityDist doesn't take params as input).

Or you can also compute the log likelihood manually afterwards. See for example Refitting PyMC3 models with ArviZ (and xarray) — ArviZ dev documentation
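Roughly, and only as an untested sketch (pointwise_loglike here stands for a plain-numpy version of your likelihood that returns one value per trial, and n_trials for the number of trials; the linked notebook shows the full pattern):

import numpy as np
import arviz as az

# sketch: evaluate a pointwise log likelihood for every posterior draw,
# then attach it as a log_likelihood group so az.loo(tr) can find it
post = tr.posterior  # tr from pm.sample(return_inferencedata=True)
n_chains, n_draws = post.dims["chain"], post.dims["draw"]
ll = np.empty((n_chains, n_draws, n_trials))
for c in range(n_chains):
    for d in range(n_draws):
        pars = [post[v].values[c, d] for v in ["alpha1", "alpha2", "Lambda", "w", "beta1", "beta2"]]
        ll[c, d] = pointwise_loglike(pars, stateActions1, stateActions2, rewards)

tr = az.concat(tr, az.from_dict(log_likelihood={"like": ll}))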

To get the log likelihood, you can use a DensityDist instead of a Potential (making sure all the inputs of observed are actual observed data, see Using a random variable as observed - #5 by OriolAbril; I think you'll need a lambda on likelihood so the function passed to DensityDist doesn't take params as input).

If I understand your suggestion correctly, I’ve replaced the like=pm.Potential… line with:

def likelihood_wrap(params):
	return lambda stateActions1, stateActions2, rewards: likelihood(params, stateActions1, stateActions2, rewards)
likelihood_wrap_ = likelihood_wrap(params)
like = pm.DensityDist("like", likelihood_wrap_(stateActions1, stateActions2, rewards), 
			observed={"stateActions1": stateActions1, "stateActions2": stateActions2, "rewards": rewards})

But that returns the error:

TypeError: 'TensorVariable' object is not callable

I’ve also played around with using the theano @as_op wrapper to try to avoid that error. Can you be a little more specific about how I should call DensityDist in my case? Thank you!

The second model doesn't seem equivalent to the initial one. Given that the first one works correctly, likelihood must be returning the log likelihood, which can't be used as a probability as is, so using it as p in the Multinomial ends up with nans or infs. You could exponentiate it to make sure the probabilities are in the 0-1 range (assuming the normalization factor is present), but even then the model is different from the original one.

I forgot to mention that I do exponentiate the output of the likelihood function when using pm.Multinomial, however it sounds like using DensityDist is the way to go.

Unlike Potential, and as shown in the link in my previous answer, DensityDist doesn't take a value but a callable. It then calls that callable with the values or dict passed as observed.

You should therefore do something like

likelihood_wrap = lambda stateActions1, stateActions2, rewards: likelihood(params, stateActions1, stateActions2, rewards)
pm.DensityDist("like", likelihood_wrap, observed=...)

The dict you already pass as observed should work.

I see. However, when calling DensityDist, I get the following error, which occurs when evaluating
stateActions1 = theano.shared(np.asarray(stateActions1, dtype='int16'))
in my likelihood function:

ValueError: setting an array element with a sequence.

So I tried moving things around, putting this part outside of likelihood, just before defining likelihood_wrap:

stateActions1 = theano.shared(np.asarray(stateActions1, dtype='int16'))
stateActions2 = theano.shared(np.asarray(stateActions2, dtype='int16'))
rewards = theano.shared(np.asarray(rewards, dtype='int16'))

but then I get

TypeError: Expected an integer

since I use stateActions for indexing. So it seems like at some point when calling DensityDist, the inputs are made into non-int theano variables. Can you think of a way to deal with this?

That is strange, what versions are you using?

As a workaround, if you only use the stateActions for indexing, you can avoid making them arguments of the lambda, like you do with params. Then use only rewards as observed (only one observed argument is necessary), which then won't need to be a dict. Again, there are many ways to do exactly the same thing with a DensityDist, as shown in Using a random variable as observed - #5 by OriolAbril
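Untested, but something along these lines:

likelihood_wrap = lambda rewards: likelihood(params, stateActions1, stateActions2, rewards)  # stateActions closed over, not passed
like = pm.DensityDist("like", likelihood_wrap, observed=rewards)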

Python 3.9.7
pymc3 3.11.4
arviz 0.11.0
numpy 1.19.2

I got it to work by only using rewards as observed, as you suggested, and by defining the theano.shared variables outside of the model (it’s probably not worth trying to understand why that latter step helped, unless you have any guesses, but it works!). Thank you. Using the lambda wrapper with DensityDist was the main key here.
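In case it helps someone else, the working arrangement looked roughly like this (reconstructed from the description above, not my exact code):

import numpy as np
import theano
import pymc3 as pm

# shared variables defined before (outside) the model block
stateActions1 = theano.shared(np.asarray(stateActions1, dtype='int16'))
stateActions2 = theano.shared(np.asarray(stateActions2, dtype='int16'))

with pm.Model() as mod:
    # ... same priors and params list as in the first model above ...
    likelihood_wrap = lambda rewards: likelihood(params, stateActions1, stateActions2, rewards)
    like = pm.DensityDist("like", likelihood_wrap, observed=rewards)
    tr = pm.sample(return_inferencedata=True)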
