Preamble: PyMC3 is legacy software that is not maintained. This reply assumes you go get the latest version, which is 5.16.2 at the time of writing.
You can add measurement error to covariates by assigning them distributions, and using the (noisy) data as observations of these distributions. Here’s an example:
import pymc as pm
import numpy as np
import arviz as az
with pm.Model() as error_model:
mu_x = pm.Normal('mu_x', sigma=100)
sigma_x = pm.Exponential('sigma_x', 0.1)
x = pm.Normal('x', mu=mu_x, sigma=sigma_x, size=(100,))
alpha = pm.Normal('alpha', sigma=100)
beta = pm.Normal('beta', sigma=10)
mu = pm.Deterministic('mu', alpha + beta * x)
sigma = pm.Exponential('sigma', 1)
obs = pm.Normal('obs', mu=mu, sigma=sigma)
Use this model to take some prior draws, and choose one to use as data, and modify the model to include the observations:
with error_model:
prior = pm.sample_prior_predictive()
draw_idx = np.random.choice(500)
true_data = prior.prior.isel(chain=0, draw=draw_idx)
model = pm.observe(error_model, {x:true_data.x, obs:true_data.obs})
model.to_graphviz()
Here’s what you end up with. Notice that uncertainty from x
flows down into obs
Sample the model and make some plots:
with model:
idata = pm.sample()
idata = pm.sample_posterior_predictive(idata, extend_inferencedata=True)
var_names = ['mu_x', 'sigma_x', 'alpha', 'beta']
ref_val = np.r_[[true_data[x].item() for x in var_names]].tolist()
az.plot_posterior(idata, var_names=var_names, ref_val=ref_val)
az.plot_ppc(idata, var_names=['obs', 'x'])
The posterior predictive draws for obs
are computed conditional on the draws of x
, combining uncertainty from both. You can verify this by comparing the outputs to a second model that directly uses the x
data:
with pm.Model() as fixed_x_model:
alpha = pm.Flat('alpha')
beta = pm.Flat('beta')
mu = pm.Deterministic('mu', alpha + beta * true_data.x.values)
sigma = pm.Flat('sigma')
obs = pm.Normal('obs', mu=mu, sigma=sigma, observed=true_data.obs.values)
fixed_x_idata = pm.sample_posterior_predictive(idata, extend_inferencedata=False, var_names=['mu', 'obs'])
Plot results:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
az.plot_ppc(idata, var_names=['obs'], ax=ax, mean=False, observed=False, legend=False, alpha=0.1)
az.plot_ppc(fixed_x_idata, var_names=['obs'], ax=ax, color='tab:orange', mean=False, observed=False, legend=False, alpha=0.1)
Blue includes uncertainty on x, orange does not