I have about three highly correlated features in my dataset (corr = 0.91), and I can't exclude them. After running the model, those highly correlated features have wide HDIs. Apart from that, the posterior predictive check shows a good fit and R² is about 0.94.
I'm trying to use pm.Laplace priors to regularize the beta coefficients in the model. However, according to the business logic of my project, the beta coefficients cannot be < 0, so I decided to put a constraint on the beta prior. I'm not sure I'm doing this correctly, but the following code gives me this error:
Bad initial energy, check any log probabilities that are inf or -inf, nan or very small:
Series([], )
My code:
with pm.Model() as model_bounded_laplace:
    lam = pm.Gamma('lambda', alpha=1, beta=0.5, shape=len(data.columns[:-1]))
    # Intercept
    alpha = pm.Normal('alpha', mu=y.mean(), sd=5)
    # Slope
    beta_ = pm.Laplace('beta_', 0, lam, shape=len(data.columns[:-1]))
    beta = pm.Deterministic('beta', lam * beta_)
    # constr. on coeff
    pm.Potential('constrain', tt.switch((beta < 0), -np.inf, 0.))
    # Error term
    eps = pm.HalfCauchy('eps', 5)
    # Expected value of outcome (MLR with vectors)
    mu = alpha + pm.math.dot(x, beta)
    # Likelihood
    tune_in_i = pm.TruncatedNormal('tune_in_i',
                                   mu=mu,
                                   sd=eps,
                                   lower=y.min(),
                                   upper=y.max(),
                                   observed=y)
    # prior_bounded_l = pm.sample_prior_predictive()
    # posterior/create the trace
    trace_bounded_laplace = pm.sample(chains=4, target_accept=0.95)
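For reference, the constraint line can be looked at in isolation. The sketch below is plain NumPy, not PyMC, and it rests on assumptions that are not confirmed by the model run: three predictors, lambda fixed at 1, and a starting point for beta_ drawn uniformly from [-1, 1] (which is what the default jitter+adapt_diag initialization is documented to do). The helper name constraint_logp is made up for illustration. Under those assumptions the switch term is -inf whenever any coefficient starts below zero, which would be consistent with the "Bad initial energy" message.

```python
import numpy as np


def constraint_logp(beta):
    """Sum of the switch term: 0 if every beta >= 0, -inf otherwise."""
    return float(np.sum(np.where(beta < 0, -np.inf, 0.0)))


rng = np.random.default_rng(0)
n_features = 3  # assumption: number of predictors

# Assumed starting point: lambda = 1, beta_ jittered uniformly in [-1, 1]
lam = np.ones(n_features)
beta_raw = rng.uniform(-1.0, 1.0, size=n_features)
beta = lam * beta_raw
print("start:", beta, "constraint term:", constraint_logp(beta))

# Fraction of random starts where all coefficients are non-negative;
# this should be close to 0.5 ** n_features
starts = rng.uniform(-1.0, 1.0, size=(10_000, n_features))
frac_ok = float(np.mean(np.all(starts >= 0, axis=1)))
print("fraction of feasible starts:", frac_ok)
```

If this is what is happening, most chains would begin at a point with log-probability -inf, and the chance of a feasible start shrinks as the number of predictors grows.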
It runs without pm.Potential, but then it gives me negative values for the beta coefficients and narrower HDIs for the highly correlated coefficients.
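One possible direction, sketched here only as an assumption and not tested in the model above: a Laplace(0, b) density restricted to non-negative values is an Exponential density with scale b, so a prior with positive support might keep the same shrinkage shape without needing a hard -inf term. The NumPy check below writes both log-densities by hand (the function names are illustrative) and shows that they differ only by the constant log 2 on x >= 0. In PyMC3 terms this would presumably correspond to pm.Exponential with rate 1/b, since pm.Exponential is parameterized by a rate and pm.Laplace by a scale.

```python
import numpy as np


def laplace_logpdf(x, b):
    """Log-density of Laplace(mu=0, scale=b)."""
    return -np.log(2.0 * b) - np.abs(x) / b


def exponential_logpdf(x, b):
    """Log-density of Exponential with scale b (rate 1/b), for x >= 0."""
    return -np.log(b) - x / b


b = 0.5  # assumption: an arbitrary scale, chosen only for the check
x = np.linspace(0.0, 5.0, 11)

# On x >= 0 the two log-densities differ by a constant, log(2),
# which is the renormalization after dropping the negative half.
diff = exponential_logpdf(x, b) - laplace_logpdf(x, b)
print(diff)
print(np.log(2.0))
```

Because the two densities have the same shape on the positive half-line, the penalty on large coefficients would be unchanged; only the negative half is removed.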
What am I doing wrong? Should I apply the regularization in a different way? Thanks.