Hi, I made a simple linear regression model that predicts the mean response `mu`, with the observed score modelled as Normal with mean `mu` and standard deviation `sigma`, as follows. This model runs without error:
```python
import numpy as np
import pymc as pm
import pytensor.tensor as pt

with pm.Model() as batting_score_model:
    batting_score_model.add_coord('observations', np.arange(X_train.shape[0]), mutable=True)
    batting_score_model.add_coord('predictors', X_train.columns.values, mutable=True)
    X_data = pm.Data("X_data", X_train.values, dims=('observations', 'predictors'), mutable=True)
    batting_scores_data = pm.Data("batting_scores_data", y_train.values, dims='observations', mutable=True)

    # Prior on error SD
    sigma = pm.HalfNormal("sigma", 25)

    # Priors for mu
    beta0 = pm.Normal("beta0", 10, 25.0)
    beta = pm.Normal("beta", 0, 10, dims="predictors")

    # Model mean
    mu = pm.Deterministic("mu", beta0 + pt.dot(X_data, beta), dims='observations')

    # Likelihood
    batting_scores = pm.Normal("batting_scores", mu=mu, sigma=sigma, observed=batting_scores_data, dims='observations')

    trace = pm.sample(4000, tune=2000, random_seed=42)
```
However, I want to add varying uncertainty, taking this chapter of the book as reference:
https://bayesiancomputationbook.com/markdown/chp_04.html#varying-uncertainty
This essentially models `sigma` as a linear function of the predictors as well, and the resulting model is as follows:
```python
with pm.Model() as batting_score_model:
    batting_score_model.add_coord('observations', np.arange(X_train.shape[0]), mutable=True)
    batting_score_model.add_coord('predictors', X_train.columns.values, mutable=True)
    X_data = pm.Data("X_data", X_train.values, dims=('observations', 'predictors'), mutable=True)
    batting_scores_data = pm.Data("batting_scores_data", y_train.values, dims='observations', mutable=True)

    # Priors for mu
    beta0 = pm.Normal("beta0", 10, 25.0)
    beta = pm.Normal("beta", 0, 10, dims="predictors")

    # Priors for sigma
    beta_error = pm.HalfNormal("beta_error", 5, dims="predictors")
    beta_error_0 = pm.HalfNormal("beta_error_0", 45)

    # Model mean
    mu = pm.Deterministic("mu", beta0 + pt.dot(X_data, beta), dims='observations')
    # Model standard deviation
    sigma = pm.Deterministic("sigma", beta_error_0 + pt.dot(X_data, beta_error), dims='observations')

    # Likelihood
    batting_scores = pm.Normal("batting_scores", mu=mu, sigma=sigma, observed=batting_scores_data, dims='observations')

    trace = pm.sample(4000, tune=2000, random_seed=42)
```
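I wondered whether the problem could be the sign of `sigma`: assuming my standardized `X_train` contains negative values, the linear combination `beta_error_0 + pt.dot(X_data, beta_error)` is not guaranteed to stay positive, even though `beta_error_0` and `beta_error` themselves are HalfNormal and positive. A minimal numpy sketch with made-up numbers (not my actual data) showing the effect:

```python
import numpy as np

# Made-up predictors (not my actual data): standardized columns can be negative
X = np.array([[1.0, 2.0],
              [-1.0, -2.0]])

# Positive coefficients, as HalfNormal draws would be
beta_error_0 = 0.5
beta_error = np.array([1.0, 1.0])

# The linear predictor for sigma can still go negative
sigma = beta_error_0 + X @ beta_error
print(sigma)  # [ 3.5 -2.5] -- the second entry is a negative "standard deviation"
```

Is this the kind of thing that would make the likelihood evaluate to -inf?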
But here I get a sampling error when running the trace:
```
SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'beta0': array(9.76088738), 'beta': array([ 0.77992321,  0.47485307,  0.21373947,  0.3663738 , -0.95693909,
       -0.05984036, -0.31696707, -0.48746525,  0.53155501,  0.22901235,
        0.72273303, -0.91572198, -0.53205798, -0.57490071, -0.32478917,
       -0.40747462, -0.2126991 , -0.07635044, -0.43753917,  0.15456078,
        0.04079334, -0.0699782 , -0.62245304, -0.69001721,  0.91382508,
       -0.68109486,  0.84212115, -0.1974131 ,  0.98941664, -0.13566268,
       -0.20309392, -0.71379639,  0.98054805]), 'beta_error_log__': array([0.7997297 , 2.26765776, 1.14193216, 2.21469177, 1.31672181,
        2.26485091, 0.78760795, 1.32772613, 1.21990339, 1.67068799,
        1.56824209, 1.41174254, 1.69074455, 2.38133294, 2.5834459 ,
        2.59560158, 0.67507601, 2.30329194, 1.30781075, 2.11791023,
        0.61329441, 1.68887444, 0.91382722, 1.98241153, 1.79008021,
        2.13389145, 1.62900647, 0.91731868, 1.03044865, 1.85202229,
        2.18212492, 0.90582926, 2.08700102]), 'beta_error_0_log__': array(4.08171827)}

Logp initial evaluation results:
{'beta0': -4.14, 'beta': -106.36, 'beta_error': -37.49, 'beta_error_0': -0.82, 'batting_scores': -inf}
You can call `model.debug()` for more details.
```
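For context, my understanding (please correct me if wrong) is that a Normal log-density is undefined for `sigma <= 0`, and PyMC reports -inf in that case, which is exactly what the initial-point check rejects. A tiny standalone illustration of the formula (my own helper, not PyMC code):

```python
import numpy as np

def normal_logp(x, mu, sigma):
    # log N(x | mu, sigma); undefined for sigma <= 0, so return -inf
    # (my understanding of how PyMC treats an invalid sigma)
    if sigma <= 0:
        return -np.inf
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

print(normal_logp(0.0, 0.0, 1.0))   # finite, about -0.9189
print(normal_logp(0.0, 0.0, -2.5))  # -inf: a single bad sigma sinks the total logp
```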
I've carefully followed the implementation from the reference above, but I still get this error: the likelihood `batting_scores` evaluates to -inf at the initial point, which triggers the sampling error. I'm not sure why this happens.
Thanks for helping!