How to: calibration -> inverse prediction

Hi everyone! I’m new to PyMC and have been doing a lot of reading and tutorials recently.
For my first real model with PyMC I’m looking to implement a multivariate linear regression.

The context is as follows:
I have 30 UV-vis spectra; each spectrum has about 130 signal values, one per wavelength. These spectra are my calibration set: both the signals and the concentrations of the nine chemical species present are known by design.

As a chemist, my typical workflow with a frequentist model is:

1- Infer from Y (signals) and X (concentrations) the parameters of the linear model: alpha (offsets) and beta (sensitivity coefficients). In the multivariate approach alpha is a row vector and beta a matrix of coefficients: Y = alpha + X*beta + epsilon. With my data, Y is 30x130, X is 30x9, beta is 9x130 and alpha is 1x130 (broadcast across the 30 rows).
2- With the parameters, infer X from Y for new samples with unknown concentrations.

I have seen examples of multivariate linear models in PyMC, but not in this particular context.

In the calibration (step 1), both X and Y are known. In the Bayesian context, Y would be modeled as normally distributed and dependent on the value of X and the parameters, with X being treated as a fixed variable with no uncertainty.
But in the inverse prediction (step 2) I need to treat X as a random variable conditional on the value of Y for the unknown samples and the values of the parameters from step 1.

I feel I can manage step 1, but I'm pretty clueless about how to get to step 2 from there. I would need to change the data Y and somehow switch X from a fixed variable to a random one, then infer its distribution for each unknown sample.
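For concreteness, here is a minimal sketch of how I picture step 1, using simulated data with my shapes (30 spectra, 130 wavelengths, 9 species); the priors and noise level are just placeholders:

import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
n, p, k = 30, 130, 9                      # spectra, wavelengths, species
X_cal = rng.uniform(0, 1, size=(n, k))    # known concentrations (by design)
true_A = rng.normal(size=p)
true_B = rng.normal(size=(k, p))
Y_cal = true_A + X_cal @ true_B + rng.normal(0, 0.05, size=(n, p))

with pm.Model() as calibration:
    A = pm.Normal("A", 0, 10, shape=p)       # offset per wavelength
    B = pm.Normal("B", 0, 10, shape=(k, p))  # sensitivity coefficients
    sigma = pm.HalfNormal("sigma", 1)
    mu = A + pm.math.dot(X_cal, B)           # X enters as fixed data here
    pm.Normal("Y", mu=mu, sigma=sigma, observed=Y_cal)
    idata_cal = pm.sample()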

If anyone has a similar example or can point me toward the best way to solve this problem, I would greatly appreciate it!

You could have x be a partially observed RV with observed = [x1, x2, x3, nan, nan] and y a completely observed RV. The unobserved entries of x will then be inferred during MCMC sampling.
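For example, something like this toy sketch (numbers made up; PyMC will emit an imputation warning and create a free variable for the nan entries):

import numpy as np
import pymc as pm

# toy data: x is known for the first 3 points, missing for the last 2
x_obs = np.array([1.0, 2.0, 3.0, np.nan, np.nan])
y_obs = np.array([1.1, 2.0, 2.9, 4.1, 5.2])

with pm.Model() as model:
    x = pm.Normal("x", mu=0, sigma=10, observed=x_obs)  # partially observed
    pm.Normal("y", mu=x, sigma=0.1, observed=y_obs)
    idata = pm.sample()  # posterior includes the imputed x entries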

@junpenglao has a nice talk showing an approach like this here: https://m.youtube.com/watch?v=nJ3XefApED0


Thanks for the response! I will definitely check out the video.

However, making X a random variable and mixing the calibration set with the unknown set seems problematic.

During the calibration step X can be considered non-random: its value is known with very high precision, since those concentrations are set by design. In the prediction step it's the opposite: Xpred is unknown and pretty much has to be treated as a random variable.

I would need a way to establish very strict priors on the values of X for the calibration data and very uninformative ones for the unknowns.

The observed part of a partially observed variable has no uncertainty or randomness whatsoever. It’s exactly the same as if you passed x as an array of numbers. The unobserved part is the one that will actually be sampled/inferred.

You can write an equivalent model that doesn't use automatic imputation; maybe that's more understandable. Here I'm using something simpler, but you should be able to translate it to your own model:

import numpy as np
import pymc as pm

with pm.Model() as m:
    x_fixed = np.array([1.0, 2.0, 3.0])            # known x values (plain data)
    x_unknown = pm.Normal("x_unknown", shape=2)    # unknown x values to infer
    x = pm.math.concatenate([x_fixed, x_unknown])  # single length-5 vector
    y = pm.Normal("y", mu=x, sigma=0.1, observed=np.array([1, 2, 3, 4, 5]))
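From there you can sample as usual and read off the posterior of x_unknown for the two unknown samples, e.g. (continuing the sketch above):

with m:
    idata = pm.sample()

# posterior means of the inferred x values
print(idata.posterior["x_unknown"].mean(dim=("chain", "draw")))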

Thank you! This seems like a promising approach! I'll toy with this idea and update the thread if I hit another roadblock.

Cheers
