Lack of convergence in a super simple toy model

It looks like I am missing something very simple. I generate a clean set of two variables with a simple linear relationship between them. Next, I use pymc3 to fit a simple linear model. My expectation is to obtain pretty precise parameter estimates, but for some reason, the credible intervals are too wide.

import numpy as np
import pymc3 as pm
import arviz as az

x = np.linspace(0, 1, 300)
y = -x + 1 + np.random.randn(len(x)) * 0.02  # can't be simpler than this

# Now, the model
def get_model(x, y):
    with pm.Model() as lm:
        a = pm.Normal('a')
        b = pm.Normal('b', mu=1)
        sig = pm.HalfNormal('sig')
        y_est = pm.Deterministic('y_est', a * x + b)
        if y is not None:
            obs = pm.Normal('obs', y_est + sig, observed=y)
    return lm


# do the sampling

with get_model(x, y) as train_model:
    train_trace = pm.sample(5000)
    az.plot_posterior(train_trace, var_names=['a', 'b', 'sig'])

The resulting plot looks like this:

which, as I said, gives estimates that are much wider than I expected. When I try to extrapolate the model and plot the 90% CI of the predictions, I get this:

which, again, is much wider than expected.

Am I missing something?

You are not learning sigma; the likelihood should probably be:
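
obs = pm.Normal('obs', y_est, sig, observed=y)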

Otherwise you’re telling the model that sigma is 1 (the default when it’s not specified in pm.Normal), and that’s why you get such a large posterior spread.

That’s also why your posterior is so irregular: sig and the intercept enter the likelihood in exactly the same way and are jointly underdetermined (as if you had two intercepts).
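
If you want to see that directly, a quick check (a rough sketch with ArviZ, using the trace from your original model) is a pair plot of the intercept and sig; they should come out strongly negatively correlated:

az.plot_pair(train_trace, var_names=['b', 'sig'])  # b and sig trade off because only their sum is identified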


Well, this indeed helped, but I don’t understand why. I expected that since I specified obs = pm.Normal('obs', y_est + sig... (note the + sig), sig would be part of the learning. Isn’t that so?

Also, what is the conceptual difference between obs = pm.Normal('obs', y_est + sig, sig, observed=y) and your suggestion obs = pm.Normal('obs', y_est, sig, observed=y) (no + sig)?

Your original PyMC3 model corresponded to the following generative model:

n = 300
x = np.linspace(0, 1, n)
y = 1 + 0.02 - x + np.random.normal(0, 1, size=n)

Your first variant, obs = pm.Normal('obs', y_est + sig, sig, observed=y), would be:

y = 1 + 0.02 - x + np.random.normal(0, 0.02, size=n)

And my suggestion, obs = pm.Normal('obs', y_est, sig, observed=y), is your original generative model:

y = 1 - x + np.random.normal(0, 0.02, size=n)
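
If you want to verify the trade-off yourself, something like this should do it (a rough sketch): fit the + sig variant directly. sig is now identified by the residual scale, so the intercept b has to absorb the extra shift and its posterior ends up centred just below the true value of 1.

with pm.Model():
    a = pm.Normal('a')
    b = pm.Normal('b', mu=1)
    sig = pm.HalfNormal('sig')
    # the "+ sig" variant from your question: sig shifts the mean *and* sets the noise scale
    obs = pm.Normal('obs', a * x + b + sig, sig, observed=y)
    trace_plus = pm.sample(2000)
az.summary(trace_plus, var_names=['a', 'b', 'sig'])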

It’s just a question of how you specify things in PyMC3. The translation would have been more intuitive if you had generated your data like this:

a = -1        # slope, as in the model's a
b = 1         # intercept, as in the model's b
sigma = 0.02

x = np.linspace(0, 1, 300)
y = np.random.normal(a * x + b, sigma)

And then it’s obvious that sigma should go into the second argument of pm.Normal in the PyMC3 model.
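
Spelled out, the matching PyMC3 model would look something like this (the same as the fix suggested above, just with the keyword arguments written explicitly):

with pm.Model() as lm:
    a = pm.Normal('a')          # slope
    b = pm.Normal('b', mu=1)    # intercept
    sig = pm.HalfNormal('sig')  # noise scale
    # sigma goes into the second (sigma) argument, not into the mean
    obs = pm.Normal('obs', mu=a * x + b, sigma=sig, observed=y)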
