Hi everyone,
I’m have a 2D dataset containing a linear relationship + gamma noise, like what follows:
import numpy as np
length = 200
rng = np.random.default_rng(1337)
x = np.linspace(1e-2, 1, num=length)
true_regression_line = 5 * x + 4
y = true_regression_line + rng.gamma(0.5, 0.2, size=length)
I can fit an ordinary linear regression with Gaussian noise with the following code:
with pm.Model() as guassian_model:
sigma = pm.HalfCauchy("sigma", beta=10)
intercept = pm.Normal("intercept", 0, sigma=20)
slope = pm.Normal("slope", 0, sigma=20)
likelihood = pm.Normal("y", mu=intercept + slope * x, sigma=sigma, observed=y)
map_estimate = pm.find_MAP()
resulting in the following fit:
So far so good. I’m now trying to improve the fit by recognising that the noise is Gamma-distributed, and replacing the Gaussian likelihood from the model above. I found the following code snippet that claims to achieve this:
with pm.Model() as gamma_model:
a = pm.Normal("a", 0, 0.5)
b = pm.Normal("b", 0, 0.5)
shape = pm.Uniform("shape", 0, 50)
μ = pm.math.exp(a + b * x)
likelihood = pm.Gamma("y", alpha=shape, beta=shape / μ, observed=y)
map_estimate = pm.find_MAP(progressbar=False)
which results in the following fit, which I find a bit disappointing, although I may be missing the point:
I was expecting a line closer to the true regression line, and suspect that there’s something odd with gamma_model
, as the Gamma noise must be >0 and that doesn’t seem to be the case with the fitted line in the plot. In all honesty, I don’t follow all the details of gamma_model
—is it using the appropriate link function for the Gamma distribution?
Thanks for your time, any tips appreciated!