Hi PyMC community,
I am pretty new to the Bayesian approach and PyMC4. I am working on a problem with about 30 data points that is a multivariable linear regression, i.e., y = a + b1 * X1 + b2 * X2 + b3 * X3, where X1, X2, and X3 each have their own uncertainty at every data point (i.e., the measurement errors are heteroscedastic). My main goal is to study how errors propagate through this regression function. For my application, the predicted y (-10 < y < 10) should have an error of no more than +/- 0.5, and I want to find out, for this 0.5 error in y, how much uncertainty is allowed in each predictor (X1, X2, X3).
To deal with this, I started very simple by simulating a simple linear regression, y = a + b * X. The code I wrote just fits a line to the data, assuming there is no uncertainty in the X variable. I would be really grateful if someone could help me with these questions:
- I do not know where to put the errors associated with each x value.
- What if I have more than one X and more than one slope? How could I model that?
#### toy data
import numpy as np
import pymc as pm

x_values = np.random.uniform(low=-2, high=8, size=(30,))
# per-point uncertainty on x (not used in the model yet)
x_values_uncertainty = np.random.uniform(low=0.05, high=2, size=(30,))
y_obs = np.random.uniform(low=-10, high=10, size=(30,))

with pm.Model() as model_ex1:
    ### intercept
    intercept = pm.Normal("intercept", mu=0.85, sigma=0.3)
    ### slope
    slope = pm.Normal("slope", mu=0.76, sigma=0.1)
    # Expected value of y, treating x_values as exact
    y_model = intercept + slope * x_values
    # Error term
    eps = pm.HalfNormal("eps", sigma=5)
    # Likelihood
    pm.Normal("obs", mu=y_model, sigma=eps, observed=y_obs)
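To make the error-budget goal described above more concrete, here is a rough sketch of the kind of calculation I have in mind for the three-predictor case. It is not a Bayesian model: it assumes the intercept and slopes are known exactly, that the errors on X1, X2, and X3 are independent and Gaussian, and that "+/- 0.5" means one standard deviation. The slope values and per-predictor uncertainties are made-up placeholders, not taken from my data, and the function name is just for illustration.

```python
import math
import random


def propagated_sigma_y(slopes, x_sigmas):
    """First-order error propagation for y = a + sum(b_i * X_i).

    Assumes independent errors on each X_i and exactly known slopes.
    """
    return math.sqrt(sum((b * s) ** 2 for b, s in zip(slopes, x_sigmas)))


# Hypothetical slopes and 1-sigma uncertainties on X1, X2, X3 (placeholders)
slopes = [0.76, -0.40, 1.10]
x_sigmas = [0.30, 0.50, 0.20]

# Analytic standard deviation of y caused by the X uncertainties
sigma_y = propagated_sigma_y(slopes, x_sigmas)

# Monte Carlo check: perturb each X with Gaussian noise and look at the spread in y
rng = random.Random(0)
n_draws = 20000
draws = [
    sum(b * rng.gauss(0.0, s) for b, s in zip(slopes, x_sigmas))
    for _ in range(n_draws)
]
mean = sum(draws) / n_draws
mc_sigma = math.sqrt(sum((d - mean) ** 2 for d in draws) / (n_draws - 1))

# One possible budget: let each predictor contribute equally to the target error
target_sigma_y = 0.5
allowed_x_sigmas = [
    target_sigma_y / (math.sqrt(len(slopes)) * abs(b)) for b in slopes
]

print("analytic sigma_y:", sigma_y)
print("Monte Carlo sigma_y:", mc_sigma)
print("allowed sigma per X (equal split):", allowed_x_sigmas)
```

The equal split is only one way to divide the budget. What this sketch leaves out is the uncertainty in the intercept and slopes themselves, which is the part I hope to get from the posterior.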
Many thanks,
Ali (UBC)