Sampling error: y is -inf when adding heteroskedastic sigma modelling

Hi, I made a simple linear regression model that predicts the mean response mu and models the observed score as Normal with mean mu and standard deviation sigma, as follows:

This model works fine without error:

import numpy as np
import pymc as pm
import pytensor.tensor as pt

with pm.Model() as batting_score_model:
    batting_score_model.add_coord('observations', np.arange(X_train.shape[0]), mutable=True)
    batting_score_model.add_coord('predictors', X_train.columns.values, mutable=True)

    X_data = pm.Data("X_data", X_train.values, dims=('observations', 'predictors'), mutable=True)
    batting_scores_data = pm.Data("batting_scores_data", y_train.values, dims='observations', mutable=True)

    # Prior on error SD
    sigma = pm.HalfNormal("sigma", 25)

    # Priors for mu
    beta0 = pm.Normal("beta0", 10, 25.0)
    beta = pm.Normal("beta", 0, 10, dims="predictors")

    # Model mean
    mu = pm.Deterministic("mu", beta0 + pt.dot(X_data, beta), dims='observations')

    # Likelihood
    batting_scores = pm.Normal("batting_scores", mu=mu, sigma=sigma, observed=batting_scores_data, dims='observations')

    # Sample inside the model context so PyMC can find the model
    trace = pm.sample(4000, tune=2000, random_seed=42)

However, I want to add varying uncertainty, taking this chapter of the book as a reference:
https://bayesiancomputationbook.com/markdown/chp_04.html#varying-uncertainty

It essentially adds a linear model for sigma, and the resulting model is as follows:

with pm.Model() as batting_score_model:
    batting_score_model.add_coord('observations', np.arange(X_train.shape[0]), mutable=True)
    batting_score_model.add_coord('predictors', X_train.columns.values, mutable=True)

    X_data = pm.Data("X_data", X_train.values, dims=('observations', 'predictors'), mutable=True)
    batting_scores_data = pm.Data("batting_scores_data", y_train.values, dims='observations', mutable=True)

    # Priors for mu
    beta0 = pm.Normal("beta0", 10, 25.0)
    beta = pm.Normal("beta", 0, 10, dims="predictors")

    # Priors for sigma
    beta_error = pm.HalfNormal("beta_error", 5, dims="predictors")
    beta_error_0 = pm.HalfNormal("beta_error_0", 45)

    # Model mean and per-observation error SD
    mu = pm.Deterministic("mu", beta0 + pt.dot(X_data, beta), dims='observations')
    sigma = pm.Deterministic("sigma", beta_error_0 + pt.dot(X_data, beta_error), dims='observations')

    # Likelihood
    batting_scores = pm.Normal("batting_scores", mu=mu, sigma=sigma, observed=batting_scores_data, dims='observations')

    trace = pm.sample(4000, tune=2000, random_seed=42)

But here I get a sampling error when running pm.sample:

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'beta0': array(9.76088738), 'beta': array([ 0.77992321, 0.47485307, 0.21373947, 0.3663738 , -0.95693909,
-0.05984036, -0.31696707, -0.48746525, 0.53155501, 0.22901235,
0.72273303, -0.91572198, -0.53205798, -0.57490071, -0.32478917,
-0.40747462, -0.2126991 , -0.07635044, -0.43753917, 0.15456078,
0.04079334, -0.0699782 , -0.62245304, -0.69001721, 0.91382508,
-0.68109486, 0.84212115, -0.1974131 , 0.98941664, -0.13566268,
-0.20309392, -0.71379639, 0.98054805]), 'beta_error_log__': array([0.7997297 , 2.26765776, 1.14193216, 2.21469177, 1.31672181,
2.26485091, 0.78760795, 1.32772613, 1.21990339, 1.67068799,
1.56824209, 1.41174254, 1.69074455, 2.38133294, 2.5834459 ,
2.59560158, 0.67507601, 2.30329194, 1.30781075, 2.11791023,
0.61329441, 1.68887444, 0.91382722, 1.98241153, 1.79008021,
2.13389145, 1.62900647, 0.91731868, 1.03044865, 1.85202229,
2.18212492, 0.90582926, 2.08700102]), 'beta_error_0_log__': array(4.08171827)}

Logp initial evaluation results:
{'beta0': -4.14, 'beta': -106.36, 'beta_error': -37.49, 'beta_error_0': -0.82, 'batting_scores': -inf}
You can call model.debug() for more details.

I've carefully followed the implementation from the reference above, but I still get this error where the logp of my y variable batting_scores becomes -inf and triggers the sampling error. I'm not sure why this happens.

Thanks for helping!

What do prior draws for sigma look like?

Thanks for helping.

Trying to get prior predictive samples from the model also yields an error: ValueError: scale < 0
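For reference, this is roughly how I tried to draw them (a minimal sketch, assuming the batting_score_model context from above and PyMC's InferenceData output):

with batting_score_model:
    # sigma is a Deterministic, so its draws are included in the
    # prior group of the returned InferenceData
    prior = pm.sample_prior_predictive(500, random_seed=42)

# If this is negative, the Normal likelihood will raise "ValueError: scale < 0"
print(prior.prior["sigma"].min().values)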

Does it mean that some of my sigma draws end up being negative and therefore throwing an error? I did specify HalfNormal priors for the sigma coefficients though, which should not produce negative values.

Does it mean that some of my sigma draws end up being negative and therefore throwing an error?

Yes, that sounds like the problem.

What about X_data, is it always strictly positive?
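Something like this would tell you (assuming X_train is a pandas DataFrame):

# True only if every predictor value is strictly positive
print((X_train.values > 0).all())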

In general I would recommend using a link function to keep a linear model on sigma positive, rather than specifying strictly positive priors. exp or softplus are both good choices.
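For example, something along these lines (a sketch of just the sigma part, reusing the names from your model; the prior scales are placeholders). With a link function the raw coefficients no longer need to be strictly positive, so plain Normal priors are fine:

# Unconstrained linear predictor for sigma
beta_error_0 = pm.Normal("beta_error_0", 0, 5)
beta_error = pm.Normal("beta_error", 0, 1, dims="predictors")

# softplus maps any real number to (0, inf), so sigma stays positive
sigma = pm.Deterministic(
    "sigma",
    pt.softplus(beta_error_0 + pt.dot(X_data, beta_error)),
    dims="observations",
)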


'batting_scores': -inf

I had a similar issue recently, and it turned out it was because I had to downsample my observed data, which caused some values to go outside the model's boundaries. Any unexpected values in your observations?


Yeah, I think you need a softplus link, as Jesse said, to make sure sigma stays positive.

My X_data can actually be negative sometimes, although most of it is positive.

Using a softplus link function solved the problem! So yes, the sigma values were ending up negative, and ensuring they stay positive solved the issue.

Thanks guys!

I'm just wondering, do you think using a link function somehow changes the estimate of sigma though? Do we need to 'unlink' sigma somewhere?

It won’t change the top-line estimate of sigma, but it will change the estimate of the parameters that go into sigma. You will need to be careful when thinking about what these mean. You can see this discussion for some ideas.
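Concretely, since sigma is wrapped in a pm.Deterministic, its posterior is already on the natural (standard deviation) scale; only the raw coefficients live on the link scale. A quick sketch, assuming ArviZ:

import arviz as az

# Directly interpretable: per-observation sigma in the units of y
az.summary(trace, var_names=["sigma"])

# On the softplus scale: interpret these with the link in mind
az.summary(trace, var_names=["beta_error", "beta_error_0"])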


Indeed. Note that softplus is approximately linear when x is positive though, so that makes all of this easier most of the time (in comparison to exp, for instance).
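A quick numerical illustration of that point:

import numpy as np

x = np.array([-5.0, 0.0, 2.0, 10.0])
softplus = np.log1p(np.exp(x))  # softplus(x) = log(1 + exp(x))
print(softplus)   # approx [0.0067, 0.693, 2.13, 10.00005]: close to x once x > 0
print(np.exp(x))  # approx [0.0067, 1.0, 7.39, 22026.5]: explodes for large x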


Yeah, great point. I'm not sure why softplus isn't a more popular link function.

True! Maybe it’s somewhat new? I personally discovered it recently, while teaching a workshop :sweat_smile:

True, I had actually tried an exponential link, but it made my estimates so huge that it created another error; using softplus went smoothly.
