Hello everyone,
relatively new user here, exploring the capabilities of PyMC and looking for some advice.
Currently I'm working with a simple model and trying to calibrate its parameters from data. This has worked fine as long as the parameters are deterministic at their core (i.e. have no aleatoric uncertainty). However, I'm struggling with parameters that have an underlying distribution of interest.
Example - Deflection Model
To make things clearer, we define a basic model with three inputs that calculates the maximum deflection of a cantilever beam, i.e. a beam with one fixed end and a force acting on the free end:
f(\mathbf{x}, \theta)=\frac{x_1x_2^3}{3I\theta}
where x_1 is the force [kN], x_2 is the length [m], \theta is the Young's modulus [GPa], and I is the second moment of area, fixed at I=320\,\text{mm}^4.
To create the scenario, we assume that \mathbf{x} are observable inputs and only \theta has to be inferred from data. Further, \theta is expected to have unknown unit-to-unit variation, \theta\sim\mathcal{N}(210, 7^2). This distribution is the quantity of interest: we want to estimate \mu_\theta and \sigma_\theta.
The model f(\mathbf{x}, \theta) serves as both the ground-truth model and the computational model, so there is no model error provided the calibration is conducted properly.
import numpy as np
import arviz as az
import pymc as pm
def f_deflection(force, length, e):
    # cantilever tip deflection, F * L^3 / (3 * E * I),
    # with the unit-conversion factors baked into the constants
    return (force * (length * 1000) ** 3) / (30 * e * 1000 * 320.0)
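As a quick spot check (with arbitrary nominal inputs, not part of the calibration data), a 1.2 kN force on a 1.0 m beam with \theta=210 gives:

f_deflection(1.2, 1.0, 210.0)  # arbitrary nominal inputs, returns ~0.595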
To calibrate the model, we create N samples from the following process:
y_{exp}=f(\mathbf{x}, \theta)+\epsilon\qquad \epsilon\sim\mathcal{N}(0, 0.1^2)
RNG = np.random.default_rng(42)

bounds = [[1, 1.5], [0.5, 2.0]]  # calibration domain
N = 5000                         # number of experiments
exp_noise = 0.1                  # std of experiment (known)
theta_gt_mu = 210.0              # mu of interest (unknown)
theta_gt_std = 7.0               # std of interest (unknown)

exp_theta = RNG.normal(theta_gt_mu, theta_gt_std, N)
x1 = RNG.uniform(*bounds[0], N)  # force [kN]
x2 = RNG.uniform(*bounds[1], N)  # length [m]
x_exp = np.stack([x1, x2], axis=1)

y_exp = f_deflection(x_exp[:, 0], x_exp[:, 1], exp_theta)
y_exp = y_exp + RNG.normal(size=N) * exp_noise
# y_exp could in principle become negative here, but this
# shouldn't affect our example
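Purely as a sanity check on the simulated data (not part of the calibration itself):

# summary statistics of the simulated experiments
print(f"y_exp: mean={y_exp.mean():.3f}, std={y_exp.std():.3f}, "
      f"min={y_exp.min():.3f}, max={y_exp.max():.3f}")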
The following PyMC code is used to calibrate the model:
with pm.Model() as basic_model:
    # priors for the unknown distribution parameters
    mu_theta = pm.Normal("mu", mu=190.0, sigma=30)
    sigma_theta = pm.HalfNormal("sigma", sigma=4)
    theta = pm.Normal("theta", mu=mu_theta, sigma=sigma_theta)

    # expected value of the outcome
    mu_delta = f_deflection(x_exp[:, 0], x_exp[:, 1], theta)

    # likelihood (experimental noise is known)
    Y_obs = pm.Normal("Y_obs", mu=mu_delta, sigma=exp_noise, observed=y_exp)

with basic_model:
    data = pm.sample(10000)

az.plot_trace(data, combined=True)
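Besides the trace plot, I look at the posterior summary and the divergence count (a minimal diagnostic; the variable names "mu" and "sigma" match the model above):

# posterior summary for the two quantities of interest
print(az.summary(data, var_names=["mu", "sigma"]))

# number of divergent transitions reported by the sampler
print("divergences:", int(data.sample_stats["diverging"].sum()))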
I would expect the epistemic uncertainty to decrease with larger N, so that we eventually obtain \mathbb{E}[\mu_\theta\vert\mathcal{D}]\approx 210 and \mathbb{E}[\sigma_\theta\vert\mathcal{D}]\approx 7 with low variance, where \mathcal{D} is the data used for calibration.
Marginal Posteriors
As the trace plots show, only \mu is estimated reasonably well, while \sigma is not. Additionally, the sampler reports a large number of divergences.
So far I have tested different prior distributions without success. I suspect there is a fundamental problem that I've overlooked.
Any further advice is very welcome.
Thanks in advance
Sven