# Multi-Output Forward Model

Hi there,

I’m looking for help defining an appropriate likelihood function, which accepts multiple inputs. I’m fairly new to pymc3 and Bayesian modelling in general, so please excuse any oversights on my end. Also, excuse some of the nomenclature; most of my exposure to Bayesian methods in practice has been through geophysical inversions literature.

I have a mass balance model, in my case a simple representation of the surface input and outputs for a glacier system, for which I need to invert for a set of unknown model parameters. The final result of the mass balance model Y is the sum of three components ( i.e. Y = X_1 + X_2 - X_3).

I initially attempted my inversion solely inverting for Y from observations Y_{\rm obs}. While I was able to reproduce Y_{\rm obs}, my posterior distributions were nonphysical for the climatic region I’m working in. I know this could be rectified by defining tighter priors, but many of these parameters have well define distributions from previous literature which are the most scientifically defensible.

As an alternative approach, in hopes of more physical results, I am now attempting the inversion for the individual components of the model X_1, X_2, X_3 based on observations X_{1_{\rm obs}}, X_{2_{\rm obs}}, X_{3_{\rm obs}}. Where I need help defining an appropriate likelihood function which accepts multiple inputs.

As an initial approach I’ve started with:

# Define Forward model (wrapped through theano)
with model:
R, A, M = PDD_forward.forward(z_obs, f_s_prior, C_prior, f_r_prior, grad_a, A_m_prior)
# net balance [m i.e. / yr ]
B = A + R - M

# Define likelihood
with model:
# Individual likelihood functions for each component
R_est = pm.Normal("R_est", mu=R, sigma=R_sigma, observed=R_obs)
A_est = pm.Normal("A_est", mu=A, sigma=A_sigma, observed=A_obs)
M_est = pm.Normal("M_est", mu=M, sigma=M_sigma, observed=M_obs)

potential = pm.Potential("obs", R_est.sum()*A_est.sum()*M_est.sum())


but I’m not sure how appropriate this is. I settled on this as my initial approach by modifying (and simplifying) the example shown here.

A little unclear whether this is the proper way to go about this. While looking through the examples on the pymc3 website, I noticed the Non-linear Differential Equations tutorial from the pymc3.ode example notebook is at least analogous to my problem, in so far as the forward model has multiple outputs. In that case there is a single likelihood function which takes in multidimensional inputs. Since the code is using the pymc3.ode API I haven’t been able to 1:1 translate over to my example. I’ve looked at the source code for pymc3.ode.DifferentialEquation to understand a bit more of what’s returned and see how I could try mimicking it. This seems somewhat complicated, so I’d thought it be best to reach out for help before diving to deep down the rabbit hole.

Thank you for taking to time to look at this. If there’s anything left out and can further clarify please let me know.

Best,
Andrew

1 Like

Welcome!

The idea of targeting multiple observations (or multiple types of observation) is totally reasonable. This came up recently in a different context. It sounds like you have observations of both x’s (predictors/features) and y’s? In such cases, people will sometimes use an error-in-variables approach, in which unobservable “latent” predictors are tied to their presumably noisy measurements via a measurement model and then the y’s are connected to the unobserved-but-inferred predictor values. This is close to what you might do though it’s too simple of be of any use on its own:

with model:
# latent x
x_mu = pm.Normal("x_mu", mu=0, sigma=1)
# observed x/measurement model
x_like  = pm.Normal("x_like", mu=mu, sigma=1, observed=x_obs)
# coefficient
beta  = pm.Normal("beta", mu=1, sigma=1)
# likelihood uses latent x, not observed
y_like = pm.Normal("y_like", mu=x_mu*beta, sigma=1, observed=y_obs)


The approach in the notebook you linked to, using pm.Potential() is different in that a potential allows you to define a custom logp. There there are fewer constraints on what you can(not) do and you can provide arbitrary conditional log likelihood values. It’s not clear that you need such flexibility in your case.

Does any of that help?

2 Likes

I think this approach is promising, but I should have provided a little bit more information about the forward model. All components of the model are a function of elevation z, and each of the component takes a set of unknown model parameters, so something like:

Y(z, \theta_1, \theta_2, \theta_3, \theta_4, \theta_5) = X_1(z, \theta_1, \theta_2) + X_2(z, \theta_3) - X_3(z, \theta_4, \theta_5) \, .

As described above I have observation Y_{\rm obs}, X_{1_{\rm obs}}, X_{2_{\rm obs}}, X_{3_{\rm obs}} which were produced from a much more complex physical model. I’m seeking to back out the \theta_1 - \theta_5 which best reproduce the observed values using our simpler forward model. I think the error-in-variables approach suggested would be great for X_1 since our simple model is unable reproduce the more complex X_{1_{\rm obs}}, but for the rest of the components I’m not sure this is the most direct approach.

For a bit more clarity and to provide a bit more detail, I’ve included the forward model code below:

def forward(self, z, f_snow, C, f_r, grad_a, A_mean):
f_ice  = C*f_snow

# temperature and PDDs calc
T      = self._air_temp(z)
PDDs   = tt.switch(tt.gt(T, self.T_m), T, 0.0).sum(axis=0)

# accumulation calc
A_days = tt.switch(tt.lt(T, self.T_rs), 1/365., 0.0).sum(axis=0)

# calculate local surface melt assuming f_m = f_snow
melt_local = PDDs * f_snow

# calculate refreezing
R = tt.minimum(f_r*A, melt_local)

# snow 2 melt ratio
r_s2m = tt.switch(tt.eq(melt_local, 0.0), 1.0, A/melt_local)

# nodally specific degree day factor
f_m = tt.switch(tt.ge(r_s2m, 1.), f_snow, f_ice - (f_ice - f_snow)*r_s2m)

# calculate surface melt [kg m^{-2} yr^{-1}] with f_m
M = f_m*PDDs

# Return individual components of the mass balance model in [m i.e. / y]
return A * (1/910), R * (1/910), M * (1/910)


You can see in the forward model code, that the X_1 is the only independent variable, the calculations of X_2 and X_3 depend on the value of X_1. This makes me inclined to think that multivariate approach is probably what I need.

I’ve also plotted the results from the inversion:

just to give a sense of what things look like.

The figure above was generated through my attempt to implement the error-in-variables approach you described, with the likelihoods define as:

# Define Forward model (wrapped through theano)
with model:
R, A, M = PDD_forward.forward(z_obs, f_s_prior, C_prior, f_r_prior, grad_a, A_m_prior)
# net balance [m i.e. / yr ]
B = A + R - M

with model:
# Individual likelihood functions for each component
R_like = pm.Normal("R_like", mu=R, sigma=R_sigma, observed=R_obs)
A_like = pm.Normal("A_like", mu=A, sigma=A_sigma, observed=A_obs)
M_like = pm.Normal("M_like", mu=M, sigma=M_sigma, observed=M_obs)
# likelihood uses latent x, not observed
B_like = pm.Normal("B_like", mu=A + R - M, sigma=B_sigma, observed=B_obs)


Thanks again for taking the time to look at this. Much appreciated!

1 Like