Lack of convergence in a super simple toy model

It looks like I am missing something very simple. I generate a clean set of two variables with a simple linear relationship between them. Next, I use pymc3 to fit a simple linear model. My expectation is to obtain pretty precise parameter estimates, but for some reason, the credible intervals are too wide.

import numpy as np
import pymc3 as pm
import arviz as az

x = np.linspace(0, 1, 300)
y = -x + 1 + np.random.randn(len(x)) * 0.02  # can't be simpler than this

# Now, the model
def get_model(x, y):
    with pm.Model() as lm:
        a = pm.Normal('a')
        b = pm.Normal('b', mu=1)
        sig = pm.HalfNormal('sig')
        y_est = pm.Deterministic('y_est', a * x + b)
        if y is not None:
            obs = pm.Normal('obs', y_est + sig, observed=y)
    return lm


# do the sampling

with get_model(x, y) as train_model:
    train_trace = pm.sample(5000)
    az.plot_posterior(train_trace, var_names=['a', 'b', 'sig'])

The resulting plot looks like this:

which, as I said, gives estimates that are much wider than I expected. When I try to extrapolate the model and plot the 90% CI of the predictions, I get this:

which, again, is much wider than expected.

Am I missing something?

You are not learning sigma; the likelihood should probably be:
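
obs = pm.Normal('obs', y_est, sig, observed=y)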

Otherwise you’re telling the model that sigma is 1 (the default when it’s not specified in pm.Normal), and that’s why you get such a large posterior spread.

That’s also why your posterior is so irregular: sig and the intercept enter the likelihood in exactly the same way and are jointly underdetermined (as if you had two intercepts).
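
If you want to see that directly, a quick check (a rough sketch with ArviZ, using the trace from your original model) is a pair plot of the intercept and sig; they should come out strongly negatively correlated:

az.plot_pair(train_trace, var_names=['b', 'sig'])  # b and sig trade off because only their sum is identified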


Well, this indeed helped, but I don’t understand why. I expected that since I specified obs = pm.Normal('obs', y_est + sig... (note the + sig), sig would be part of the learning. Isn’t that so?

Also, what is the conceptual difference between obs = pm.Normal('obs', y_est + sig, sig, observed=y) and your suggestion obs = pm.Normal('obs', y_est, sig, observed=y) (no + sig)?

Your original PyMC3 model corresponded to the following generative model:

n = 300
x = np.linspace(0, 1, n)
y = 1 + 0.02 - x + np.random.normal(0, 1, size=n)

Your first variant, obs = pm.Normal('obs', y_est + sig, sig, observed=y), would be:

y = 1 + 0.02 - x + np.random.normal(0, 0.02, size=n)

And my suggestion, obs = pm.Normal('obs', y_est, sig, observed=y), is your original generative model:

y = 1 - x + np.random.normal(0, 0.02, size=n)
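
If you want to verify the trade-off yourself, something like this should do it (a rough sketch): fit the + sig variant directly. sig is now identified by the residual scale, so the intercept b has to absorb the extra shift and its posterior ends up centred just below the true value of 1.

with pm.Model():
    a = pm.Normal('a')
    b = pm.Normal('b', mu=1)
    sig = pm.HalfNormal('sig')
    # the "+ sig" variant from your question: sig shifts the mean *and* sets the noise scale
    obs = pm.Normal('obs', a * x + b + sig, sig, observed=y)
    trace_plus = pm.sample(2000)
az.summary(trace_plus, var_names=['a', 'b', 'sig'])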

It’s just a question of how you specify things in PyMC3. The translation would have been more intuitive if you had generated your data like this:

a = -1        # slope, as in the model's a
b = 1         # intercept, as in the model's b
sigma = 0.02

x = np.linspace(0, 1, 300)
y = np.random.normal(a * x + b, sigma)

And then it’s obvious that sigma should go into the second argument of pm.Normal in the PyMC3 model.
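
Spelled out, the matching PyMC3 model would look something like this (the same as the fix suggested above, just with the keyword arguments written explicitly):

with pm.Model() as lm:
    a = pm.Normal('a')          # slope
    b = pm.Normal('b', mu=1)    # intercept
    sig = pm.HalfNormal('sig')  # noise scale
    # sigma goes into the second (sigma) argument, not into the mean
    obs = pm.Normal('obs', mu=a * x + b, sigma=sig, observed=y)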
