Black box likelihood example

Hi all,

I have been attempting to get the (gradient-less) black-box likelihood function to work. I followed this example: Using a “black box” likelihood function (numpy) — PyMC example gallery. I can get that code to run, but I obtain the following error when trying to extend it to my problem:

Traceback (most recent call last):
  File "/home/henney/.local/lib/python3.10/site-packages/pytensor/compile/function/types.py", line 970, in __call__
    self.vm()
ValueError: Not enough dimensions on input.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/henney/Documents/Oxford/General_electrochemistry/heuristics/combined_heuristics_pmc3.py", line 215, in <module>
    idata_mh = pm.sample(10000, tune=2000)
  File "/home/henney/.local/lib/python3.10/site-packages/pymc/sampling/mcmc.py", line 619, in sample
    model.check_start_vals(ip)
  File "/home/henney/.local/lib/python3.10/site-packages/pymc/model.py", line 1781, in check_start_vals
    initial_eval = self.point_logps(point=elem)
  File "/home/henney/.local/lib/python3.10/site-packages/pymc/model.py", line 1816, in point_logps
    self.compile_fn(factor_logps_fn)(point),
  File "/home/henney/.local/lib/python3.10/site-packages/pymc/pytensorf.py", line 764, in __call__
    return self.f(**state)
  File "/home/henney/.local/lib/python3.10/site-packages/pytensor/compile/function/types.py", line 983, in __call__
    raise_with_op(
  File "/home/henney/.local/lib/python3.10/site-packages/pytensor/link/utils.py", line 535, in raise_with_op
    raise exc_value.with_traceback(exc_trace)
  File "/home/henney/.local/lib/python3.10/site-packages/pytensor/compile/function/types.py", line 970, in __call__
    self.vm()
ValueError: Not enough dimensions on input.
Apply node that caused the error: Sum{acc_dtype=float64}(likelihood)
Toposort index: 16
Inputs types: [TensorType(float64, (?,))]
Inputs shapes: [()]
Inputs strides: [()]
Inputs values: [array(-966.97620582)]
Outputs clients: [['output']]

Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
  File "/home/henney/Documents/Oxford/General_electrochemistry/heuristics/combined_heuristics_pmc3.py", line 215, in <module>
    idata_mh = pm.sample(10000, tune=2000)
  File "/home/henney/.local/lib/python3.10/site-packages/pymc/sampling/mcmc.py", line 619, in sample
    model.check_start_vals(ip)
  File "/home/henney/.local/lib/python3.10/site-packages/pymc/model.py", line 1781, in check_start_vals
    initial_eval = self.point_logps(point=elem)
  File "/home/henney/.local/lib/python3.10/site-packages/pymc/model.py", line 1811, in point_logps
    factor_logps_fn = [pt.sum(factor) for factor in self.logp(factors, sum=False)]
  File "/home/henney/.local/lib/python3.10/site-packages/pymc/model.py", line 1811, in <listcomp>
    factor_logps_fn = [pt.sum(factor) for factor in self.logp(factors, sum=False)]

I haven’t been able to recreate this error in the simpler example code linked above, though. The only observation I have is that when I raise an error in the perform() function, the dimensions of the tensor are not empty, i.e.:

Toposort index: 15
Inputs types: [TensorType(float64, (14,))]
Inputs shapes: [(14,)]
Inputs strides: [(8,)]
Inputs values: ['not shown']
Outputs clients: [[Sum{acc_dtype=float64}(likelihood)]]

So I am mystified as to where this error is actually occurring. Any suggestions would be extremely welcome!

Are you following the example exactly (copy and paste style) or using it as a template?

I’ve hewed pretty closely to it.

Here’s my likelihood:


def custom_log_likelihood(parameters, data, sim_class):
    parameters = parameters[0]
    min_sigma = parameters[-1]
    trumpet_sigmas = parameters[-3:-1]
    eis_noise_params = parameters[-7:-3]
    simulation = sim_class.test_vals(parameters[:-7])
    EIS_likelihoods = [0, 0]
    trumpet_likelihoods = [0, 0]
    for i in range(2):
        EIS_likelihoods[i] = multitiplicative_gaussian_likelihood(simulation[0][:, i], data[0][:, i], eis_noise_params[2 * i], eis_noise_params[(2 * i) + 1])
        trumpet_likelihoods[i] = gaussian_log_likelihood(simulation[1][:, i], data[1][:, i], trumpet_sigmas[i])
    min_likelihood = [gaussian_log_likelihood(simulation[2], data[2], min_sigma)]
    all_likelihoods = EIS_likelihoods + trumpet_likelihoods + min_likelihood

    return np.array(np.sum(all_likelihoods))

The interfacing class:

class LogLike(pt.Op):
    itypes = [pt.dvector]
    otypes = [pt.dvector]

    def __init__(self, data, func, sim_class):
        self.data = data
        self.func = func
        self.sim_class = sim_class

    def perform(self, node, inputs, outputs):
        logl = self.func(inputs, self.data, self.sim_class)
        outputs[0][0] = logl


c = LogLike(sim_class.total_likelihood, custom_log_likelihood, sim_class)

and the sampling code:

with pm.Model() as model:

    parameter_distributions = [pm.Uniform(x, lower=param_bounds[x][0], upper=param_bounds[x][1], initval=true_param_vals_dict[x]) for x in sim_class.common_optim_list]
    noise_distributions = [pm.Uniform(x, lower=0, upper=10, initval=1) for x in noise_dist_names]
    composed_dist = parameter_distributions + noise_distributions

    theta = pt.as_tensor_variable(composed_dist)

    pm.Potential("likelihood", c(theta))
    idata_mh = pm.sample(10000, tune=2000)

I take it sim_class.common_optim_list and noise_dist_names are lists of strings, and composed_dist is also a list of RVs? It might be worth scattering a few print statements around to check everything is as it should be. I’ve had the same problem where
theta = pt.as_tensor_variable(composed_dist) does not send what I think it does to the likelihood function.

Also, when you do parameters = parameters[0], do you intend to overwrite it before the rest of the assignments?

Thanks for the response! And yes, everything you’ve asked about is a list of strings. Inspecting the type() of composed_dist yields a list of pytensor.tensor.var.TensorVariable, which I presume is correct.

As to your second point, the parameters argument appears to be passed to the function as [np.array(values)], so that line is there to obtain just the numpy array.
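For what it's worth, here's a tiny plain-numpy illustration of what perform() receives (hypothetical values, assuming a single dvector itype):

```python
import numpy as np

# With itypes = [pt.dvector], perform() receives `inputs` as a list with one
# entry per declared input type, i.e. something like [np.array(values)].
inputs = [np.array([0.1, 0.2, 0.3])]

# Indexing with inputs[0] (or parameters[0] further down the call chain)
# recovers the bare parameter array; tuple-unpacking does the same thing:
(parameters,) = inputs

assert np.array_equal(parameters, np.array([0.1, 0.2, 0.3]))
```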

So the likelihood function expects a PyTensor dvector of type float64, and I think if you’ve got [np.array(values)] to work with after theta has been passed to c(), that might be your problem. I think you should have something more like just [values], as in the example where the argument of pt.as_tensor_variable() is [m, c].

Also, if that is no help: I’ve just had another look at the way you’ve set up the class, and I think there may be an issue there. You haven’t unpacked inputs in the perform function.

class LogLike(pt.Op):
    itypes = [pt.dvector]
    otypes = [pt.dvector]

    def __init__(self, data, func, sim_class):
        self.data = data
        self.func = func
        self.sim_class = sim_class

    def perform(self, node, inputs, outputs):
        (parameters, ) = inputs  # something like this I think
        logl = self.func(parameters, self.data, self.sim_class)
        outputs[0][0] = logl


c = LogLike(custom_log_likelihood, sim_class)  # I think you only need ‘func’ and ‘sim_class’ when you create the Op initially

Not sure it’s helpful (or even correct), but I think it should be otypes = [pt.dscalar]? Because the sampler expects logp to be a scalar?

I don’t like the itypes/otypes approach so much. I suggest you implement make_node, which is more flexible and easier to debug. It’s what we use in this blogpost: How to use JAX ODEs and Neural Networks in PyMC - PyMC Labs

Anyway, your original error said that some Op was expecting a vector input but got a scalar during evaluation, which agrees with @cluhmann's suggestion that you misspecified the otypes:

Inputs types: [TensorType(float64, (?,))]
Inputs shapes: [()]
Inputs strides: [()]
Inputs values: [array(-966.97620582)]

Aha, thank you very much all! The error was indeed the otypes being set to dvector. Changing it to
otypes = [pt.dscalar]
fixes the issue.

Thanks again!
