Perform time-dependant model calibration

I’m trying to calibrate a time-dependant deterministic model following this dynamics: z_t | \theta_1, \theta_2, \theta_3 = f(z_{t-1}, (\theta_1, \theta_2, \theta_3)) where f is some deterministic (non linear) function and (\theta_1, \theta_2, \theta_3) are the parameters I want to calibrate using a bayesian approach.

The idea was to define the likelihood function like this : Y_t | (\theta_1, \theta_2, \theta_3) \sim N( z_t | (\theta_1, \theta_2, \theta_3) , \sigma) where Y_t is my observation at time t (seen as a RandomVariable distributed from the likelihood), z_t my model prediction and \sigma is the (fixed) error of measure.

The troubles come when I try to declare the likelihood. I compute the model predictions, store them in a numpy.array and give them to the pymc3.Normal() function.

obs = pm.Normal('observations', mu=simus, sigma=measure_error, observed=data)   # simus contains the predictions for each date at which a measure was done

it raises a

ValueError: setting an array element with a sequence.

I defined the same model using pymc2 and the inference worked well. Here is the code where I define the model (the inference was performed in an other file that imported the model):

# priors
theta1 = pymc.Beta('theta1', alpha=alpha_t1, beta=beta_t1)
theta2 = pymc.Beta('theta2', alpha=alpha_t2, beta=beta_t2)
theta3 = pymc.Normal('theta3', mu=mu_t3, tau=sig_t3)

# apply markov model using random variables as parameters for the dates of measures
@pymc.deterministic  # decorator allows to return a pymc.Deterministic object
def wrap_simu(th1=theta1, th2=theta2, th3=theta3):
    params = {'theta1': th1, 'theta2': th2, 'theta3': th3}
    simus = run_simu(params)  # returns an array containing the predictions
    return simus

# likelihood (observations are distributed from likelihood)
obs = pymc.Normal('obs', mu=wrap_simu, tau=measure_error, value=measures, observed=True)

I also tried to use a Deterministic variable with pymc3:

simus = pm.Deterministic('simus', run_simu(z0, final_t, dates, theta1, theta2, theta3))

but it raises:

TypeError: order not understood

I know that my pymc3 code fails because I don’t use proper Theano objects but I don’t know how to cast the numy.array containing my model predictions into a theano.vector (the theano.shared() did not work).

So, does anyone have an idea of how to proceed ?

Thank you very much.

Looking at the function arg, seems you need to do something like

simus = pm.Deterministic('simus', run_simu({'theta1': theta1, 'theta2': theta2, 'theta3': theta3}))

Thank you for your fast answer.

Sorry I did not put the entiere code in my first message. Here is my function run_simu(parameters)(I removed the unnecessary arguments z0, final_t, datesand give it a dictionnary of parameters as you suggested. In fact, I defined it totally independant from the pymc3 and Theano modules:

def run_simu(params):
    records = []
    z_t = z0
    for day in range(final_t):
        if day in dates:  # just to record the simulated values only when I have measures
        # my model equation
        z_t = z_t * params['theta2'] + np.sin(2 * np.pi * day / params['theta3']) + params['theta1']\
              * np.cos(2 * np.pi * z_t)
    return np.array(records)

When I call it now I get the same error as before.

If I don’t use the pm.Determinstic object and just call the function run_simu(parameters)troubles only come when I declare my likelihood function. I think the issue comes from the fact that this function is not consistent with Theano : the parameters are TransformedRV (\theta_1 and \theta_2) or FreeRV (\theta_3) objects, when I combine them in the loop I get Elemwise{add,no_inplace}.0objects that I store in an array which is returned. Maybe the mean of the Normal likelihood cannot be an array of Elemwise{add,no_inplace}.0, but then do you have an idea of what should be done to fix it ? Should I declare it with the theano.as_op decorator?

Thank you again.

In general, a for loop will not perform very well in theano/pymc3. Here your first step should be removing all np* operation with theano.tensor.* operation. I suggest you to first have a look at

1 Like

run_simu expects numpy or python values for theta2 and theta3, but it gets symbolic theano variables. So your loop doesn’t work as it should. If final_t is small, you could just use a loop anyway, and build up a large theano graph that contains an unrolled version of that loop. So something like this:

    records = []
    z_t = z0
    for day in range(final_t):
        z_t = z_t * params['theta2'] + np.sin(2 * np.pi * day / params['theta3']) + params['theta1']\
              * np.cos(2 * np.pi * z_t)
    return tt.concatenate(records)[dates]

The graph size will grow linearly with final_t however. So if that is large you need to explain the loop itself to theano. theano.scan does that:

You could also use theano.as_op, but unless you want to implement the gradients on your own, most samplers won’t work with that.

This might help you to understand theano a bit better: