Rounding Error Model

I am working with sensor data that has limited precision which I believe to be important to incorporate into my model. However, I’m struggling with how to implement this. I feel like this is really obvious and I’m just missing something.

I found this section in the Stan user guide: Measurement Error and Meta-Analysis but I’m having difficulty converting it to PyMC. They have two versions; the one that I think is closest to PyMC is

# y = (10, 10, 12, 11, 9)

data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
  vector<lower=y-0.5, upper=y+0.5>[N] z;
}
model {
  sigma ~ normal(0, 1);
  z ~ normal(mu, sigma);
}

trying to convert this to PyMC, I was thinking something like this:

y = np.array([10, 10, 12, 11, 9])

with pm.Model() as rounding_error_model:

    mu = pm.Flat("mu")
    sigma = pm.HalfNormal('sigma', sigma=1)
    z = pm.Normal('z', mu=mu, sigma=sigma, shape=y.shape)

    lower = pm.math.floor(z)
    upper = pm.math.ceil(z)

    y_obs = pm.Uniform('y_obs', observed=y, lower=lower, upper=upper)
    pm.sample()

but this won’t sample because the logp of y_obs will be -inf. Are there any ways that I could rework this?

Thank you for your time!

PyMC can give you the probability of a rounded distribution directly. The equivalent to the first model in Stan is something like this (disclaimer: I didn’t run this code):

import pymc as pm

def rounded_normal_dist(mu, sigma, size):
  return pm.math.round(pm.Normal.dist(mu, sigma, size=size))

with pm.Model() as m:
  mu = pm.Flat("mu")
  sigma = pm.HalfNormal("sigma")
  y_obs = pm.CustomDist("y_obs", mu, sigma, dist=rounded_normal_dist, observed=[10, 10, 12, 11, 9])

Docstrings on CustomDist, and the dist param specifically: pymc.CustomDist — PyMC v5.13.1 documentation

Otherwise, for the second approach is a bit harder to translate. I guess you can do something by specifying an interval transform to z between y±0.5, and you don’t need a likelihood. But I’m not 100% sure on what that model means exactly in Stan.

Yup… I figured it would be something simple like this. Thank you for your help!

By default, Stan samples uniformly on values that satisfy the constraint, so this

vector<lower=y-0.5, upper=y+0.5>[N] z;

implies the following in more generative terms,

for (n in 1:N) {
  z[n] ~ uniform(y - 0.5, y + 0.5);
}

So it’s just saying the prior is that the value rounded to y is uniformly distributed between y - 0.5 and y + 0.5. This is assuming the only noise is measurement error and the rounding is to units.

1 Like