Inversion of parameter given forward simulation function

I have a forward simulation function forward_model that gives some parameters of type ndarray input and returns a result of type ndarray. I want to estimate the parameter epsilon_r given the synthetic data. Below is my script, but I don’t know how to include the function in MCMC model part. The problem seems that epsilon_r is a Tensorvariable and the simulation function cannot work with it. How to solve it? Thank you!

with pm.Model() as model:

    epsilon_r = pm.Uniform('epsilon_r', lower=1.0, upper=15.0, shape=true_epsilon_r.shape)

    sigma = pm.HalfNormal('sigma', sigma=1.0)

    mu = pm.Deterministic('mu', forward_model(epsilon_r, true_loss_tangent, true_thickness))

    likelihood = pm.Normal('likelihood', mu=mu, sigma=sigma, observed=synthetic_data)

    trace = pm.sample(10000, tune=1000, return_inferencedata=True)

Is it a simulation or a deterministic function?

I am sorry I am new to Pymc, but the forward_model is a deterministic function; it will return the same results regardless of when that same input is entered.

So ideally you would rewrite it with PyTensor operators so that you can get gradients for free and use NUTS for sampling (and possibly higher performance).

Mostly you can replace np.foo(a, b) by pytensor.tensor.foo(a, b), with special handling for conditionals/ loops and variable assignment: Tutorial — PyTensor dev documentation

If there’s no way of you recreating the function with PyTensor operations you can wrap it in a PyTensor Op as a blackbox function: Using a “black box” likelihood function — PyMC example gallery

1 Like

Thank you very much for your advice. I still have some questions. It looks like I need to use PyTensor Op, but I’m not sure if I have to define my own log-likelihood function. Can I just input the forward function and observations without defining log-likelihood function?

Yes, you can define an Op to just wrap your forward function, not just a likelihood

Thank you. I still run into an error. Here is my code:


radarspy_path = r"I:\projects\RadSPy-1DRadarSimulator"
sys.path.append(radarspy_path)

from RadSPy_op import simulator_1D

def my_model(epsilon_r, loss_tangent, thicknesses, dB_noise)
     ...
     The simulation function use the parameters given and returns an ndarray
     ...
    return np.real(result[start_index:end_index, 1] )

def my_loglike(epsilon_r, loss_tangent, thicknesses, dB_noise, data):
    for param in (epsilon_r, loss_tangent, thicknesses, dB_noise, data):
        if not isinstance(param, (float, np.ndarray)):
            raise TypeError(f"Invalid input type to loglike: {type(param)}")
    model = my_model(epsilon_r, loss_tangent, thicknesses, dB_noise)
    return -0.5 * ((data - model) / np.abs(dB_noise)) ** 2 - np.log(np.sqrt(2 * np.pi)) - np.log(np.abs(dB_noise))

class LogLike(Op):
    def make_node(self, epsilon_r, loss_tangent, thicknesses, dB_noise, data) -> Apply:
        # Convert inputs to tensor variables
        epsilon_r = pt.as_tensor(epsilon_r)
        loss_tangent = pt.as_tensor(loss_tangent)
        thicknesses = pt.as_tensor(thicknesses)
        dB_noise = pt.as_tensor(dB_noise)
        data = pt.as_tensor(data)

        inputs = [epsilon_r, loss_tangent, thicknesses, dB_noise, data]
        outputs = [data.type()]

        return Apply(self, inputs, outputs)

    def perform(self, node: Apply, inputs: list[np.ndarray], outputs: list[list[None]]) -> None:
        epsilon_r, loss_tangent, thicknesses, dB_noise, data = inputs  # this will contain my variables
        loglike_eval = my_loglike(epsilon_r, loss_tangent, thicknesses, dB_noise, data)
        outputs[0][0] = np.asarray(loglike_eval)
# set up our data
true_epsilon_r = np.array([8.8, 5.0, 7.0])
true_loss_tangent = np.array([0.01, 0.02, 0.03])
true_thickness = np.array([30, 40])

dBnoise = -40
synthetic_data = my_model(true_epsilon_r, true_loss_tangent, true_thickness, dBnoise)
# create our Op
loglike_op = LogLike()
test_out = loglike_op(true_epsilon_r, true_loss_tangent, true_thickness, -100, synthetic_data)

def custom_dist_loglike(data, true_epsilon_r, true_loss_tangent, true_thickness, dB_noise):
    # data, or observed is always passed as the first input of CustomDist
    return loglike_op(true_epsilon_r, true_loss_tangent, true_thickness, dB_noise, data)

# use PyMC to sampler from log-likelihood
with pm.Model() as no_grad_model:
    # uniform priors on epsilon_r
    eps = pm.Uniform("eps", lower=3.0, upper=10.0, shape=3)
    # use a CustomDist with a custom logp function
    likelihood = pm.CustomDist("likelihood", eps, true_loss_tangent, true_thickness, dBnoise, observed=synthetic_data, logp=custom_dist_loglike)
    trace = pm.sample(5000, tune=1000, progressbar=True, return_inferencedata=True)

The error reads in likelihood = pm.CustomDist(“likelihood”, eps, true_loss_tangent, true_thickness, dBnoise, observed=synthetic_data, logp=custom_dist_loglike)
ValueError: Could not broadcast dimensions. Incompatible shapes were [(ScalarConstant(ScalarType(int64), data=3),), (ScalarConstant(ScalarType(int64), data=3),), (ScalarConstant(ScalarType(int64), data=2),), (ScalarConstant(ScalarType(int64), data=1),)]

It seems that the CustomDist function cannot handle the input parameter correctly, as the example given in PyTensor Op without gradients Model definition Using a “black box” likelihood function — PyMC example gallery

You don’t need a CustomDist, just an Op to wrap around forward_model, so your case is just a subset of that blackbox likelihood tutorial

Still the simplest solution is usually to just use pytensor builtin Ops. How does your forward function look like actually?

Thank you, I know that maybe rewrite the forward function into Pytensor type may get enhanced efficiency. But the class is too complicated for me to rewrite. I would appreciate it if you gave me some advice. I have uploaded the forward simulation script for you.
RadSPy_op.py (5.5 KB)

Hi, I have rewritten the code, it is almost there, but it came across an error.

class Simulator1DOp(Op):
    # __props__ = ('H', 'windowNum', 'fmin', 'fmax', 'pulse', 't', 'matchFilter', 'f', 'tmf')

    def __init__(self, H, windowNum, fmin, fmax, pulse, t, matchFilter, f, tmf):
       ...

    def make_node(self, eps, lossTangent, thicknesses):
        eps = pt.as_tensor_variable(eps)
        lossTangent = pt.as_tensor_variable(lossTangent)
        thicknesses = pt.as_tensor_variable(thicknesses)
        return Apply(self, [eps, lossTangent, thicknesses], [pt.matrix()])

    def perform(self, node, inputs, outputs):
        eps, lossTangent, thicknesses = inputs
        result = self.run_sim(eps, lossTangent, thicknesses)
        outputs = np.asarray(result)

    def run_sim(self, eps, lossTangent, thicknesses):  # the main simulation function
       ...


with pm.Model() as model:
    eps = pm.Uniform('eps', lower=3.0, upper=10.0, shape=3)

    simulated_data = simulator_op(eps, true_loss_tangent, true_thicknesses)

    sigma = pm.HalfNormal('sigma', sigma=1.0)

    pm.Normal('obs', mu=simulated_data, sigma=sigma, observed=synthetic_data)

    trace = pm.sample(5000, tune=1000, progressbar=True, return_inferencedata=True)

The error occurred as:

The tested instance of Simulator1DOp cliass is
Simulator1DOp [id A] <Matrix(float64, shape=(?, ?))>
├─ [3.8 8. 3. ] [id B] <Vector(float64, shape=(3,))>
├─ [0.0001 0. … 02 0.0003] [id C] <Vector(float64, shape=(3,))>
└─ [30 40] [id D] <Vector(int32, shape=(2,))>

Could you give me some advice? Thank you very much!

It’s telling you that your observations shape are not compatible with the parameters mu or sigma, that are likely larger

Thank you very much. I have solved the problem with your suggestions.