I am very new to PyMC and Bayesian linear regression, and I am currently following the documentation provided for PyMC 5.7.0 here to predict a target value from seven predictors. These predictors are not highly correlated with the target variable. However, the results I obtained from the model are disappointing, and I am unsure how to tune the parameters or set the priors to improve the model's performance.
Also, I am not sure how to use this model to make predictions on unseen data and plot the HDI for them.
I appreciate any help or advice!
Here is a snapshot of the results and the data:
import numpy as np
import pymc as pm
import pytensor.tensor as at

N, D = x_train.shape
D0 = int(D / 2)  # prior guess for the number of relevant predictors

with pm.Model(coords={"predictors": x_train.columns.values}) as test_drn_model:
    # Prior on error SD
    sigma = pm.HalfNormal("sigma", 20)
    # Global shrinkage prior
    tau = pm.HalfStudentT("tau", 2, D0 / (D - D0) * sigma / np.sqrt(N))
    # Local shrinkage prior
    lam = pm.HalfStudentT("lam", 2, dims="predictors")
    c2 = pm.InverseGamma("c2", 1, 0.1)
    z = pm.Normal("z", 0.0, 1.0, dims="predictors")
    # Shrunken coefficients
    beta = pm.Deterministic(
        "beta", z * tau * lam * at.sqrt(c2 / (c2 + tau**2 * lam**2)), dims="predictors"
    )
    # No shrinkage on intercept
    beta0 = pm.Normal("beta0", 100, 25.0)
    # Likelihood of the observed target
    Value = pm.Normal("Value", beta0 + at.dot(x_train.values, beta), sigma, observed=y_train.values)

with test_drn_model:
    # High target_accept to reduce divergences from the shrinkage priors
    idata = pm.sample(1000, tune=2000, random_seed=42, target_accept=0.99)
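
For the unseen-data part, below is a rough sketch of what I think the computation would look like once posterior draws are available. It is not taken from the PyMC documentation: it recomputes the linear predictor by hand with NumPy from the draws of `beta0`, `beta`, and `sigma`, adds the observation noise, and takes the narrowest interval containing 94% of the draws as the HDI. The helper names `predict_draws` and `hdi`, the `x_new` array, and the randomly generated stand-in draws are all assumptions made for illustration; with a real fit, the draws would come from `idata.posterior` as hinted in the comments.

```python
import numpy as np


def predict_draws(beta0, beta, sigma, x_new, rng):
    """Posterior predictive draws for new rows.

    beta0: (S,), beta: (S, D), sigma: (S,), x_new: (M, D). Returns (S, M).
    """
    mu = beta0[:, None] + beta @ x_new.T  # linear predictor per draw and row
    return rng.normal(mu, sigma[:, None])  # add observation noise


def hdi(samples, prob=0.94):
    """Narrowest interval holding `prob` of the draws, per column of (S, M)."""
    s = np.sort(samples, axis=0)
    n = s.shape[0]
    k = int(np.floor(prob * n))  # index distance spanned by the interval
    widths = s[k:] - s[: n - k]  # width of every candidate interval
    start = np.argmin(widths, axis=0)  # narrowest candidate per column
    cols = np.arange(s.shape[1])
    return s[start, cols], s[start + k, cols]


rng = np.random.default_rng(42)
S, D, M = 4000, 7, 5  # draws, predictors, new rows (hypothetical sizes)

# Stand-in posterior draws, NOT real results. With a fitted model, I assume
# they could be pulled out along these lines:
#   post = idata.posterior.stack(sample=("chain", "draw"))
#   beta0 = post["beta0"].values                                   # (S,)
#   beta = post["beta"].transpose("sample", "predictors").values   # (S, D)
#   sigma = post["sigma"].values                                   # (S,)
beta0 = rng.normal(100.0, 1.0, size=S)
beta = rng.normal(0.0, 0.5, size=(S, D))
sigma = np.abs(rng.normal(5.0, 0.2, size=S))

x_new = rng.normal(size=(M, D))  # hypothetical unseen predictor rows

y_draws = predict_draws(beta0, beta, sigma, x_new, rng)
y_mean = y_draws.mean(axis=0)  # point prediction per new row
lower, upper = hdi(y_draws)  # 94% HDI bounds per new row
```

The `lower` and `upper` arrays could then be drawn as a band around `y_mean`, for example with matplotlib's `fill_between`. Note that `x_new` would need to go through the same scaling or preprocessing as `x_train`.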