Computing Credible Intervals for Predictions

I want to make sure that the prediction intervals are being computed correctly. Could the good people on this platform help me out?


import pymc3 as pm


def wide(X_train, y_train, model_type):
    M = X_train.shape[1]
    with pm.Model() as model:
        β = pm.Normal("β", mu=0.0, sd=10.0, shape=M)
        β0 = pm.Normal("β0", mu=0.0, sd=10.0)
        σ = pm.HalfNormal("σ", sd=2.5)
        # register the predictors as mutable data so they can be swapped out at prediction time
        pred = pm.Data("pred", X_train)
        mu = pm.Deterministic("mu", β0 + pm.math.dot(pred, β))
        obs = pm.Normal("obs", mu=mu, sd=σ, observed=y_train)
    return model


def predict(model, trace, X_all):
    with model:
        # update the values of the predictors:
        pm.set_data({"pred": X_all})
        # use the updated values to predict outcomes:
        posterior_predictive = pm.sample_posterior_predictive(
            trace, var_names=["mu"], random_seed=42, samples=1000,
        )  # should this be "obs" in place of "mu"? (see Q2 below)
        model_preds = posterior_predictive["mu"]  # likewise "obs" in place of "mu"
        return model_preds, posterior_predictive

Get the predicted y:

import arviz as az

y_pred = model_preds.mean(axis=0)

# az.hdi returns an (n_obs, 2) array: column 0 holds the lower bounds, column 1 the upper
bands = az.hdi(model_preds, hdi_prob=0.95)
lower = bands[:, 0]
upper = bands[:, 1]

or, equivalently:

lower = az.hdi(model_preds, hdi_prob=0.95)[:, 0]
upper = az.hdi(model_preds, hdi_prob=0.95)[:, 1]

Here upper and lower are the upper and lower bounds of the credible intervals.
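As a quick sanity check on the shapes and bounds, here is a pure-NumPy sketch using synthetic draws in place of `model_preds` (it computes the central 95% interval via quantiles, which matches the HDI only for roughly symmetric draws, but is enough to verify the per-observation layout):

```python
import numpy as np

rng = np.random.default_rng(42)
# fake posterior-predictive draws: 4000 samples for each of 5 observations
model_preds = rng.normal(loc=0.0, scale=1.0, size=(4000, 5))

# central 95% interval per observation (axis 0 is the sample axis)
lower = np.quantile(model_preds, 0.025, axis=0)
upper = np.quantile(model_preds, 0.975, axis=0)

print(lower.shape, upper.shape)  # (5,) (5,)
```

For standard-normal draws both bounds should sit near ±1.96, which gives a cheap cross-check on whatever `az.hdi` returns.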

For some simple models, this seems to yield larger credible intervals than the corresponding frequentist methods. Even allowing for my rather large standard deviations, I was surprised, so I wanted to check whether this makes sense.

Q1) Is the above implementation correct?
Q2) Should I use mu in place of obs to get the predicted value? mu captures the predicted value for any particular combination of estimated parameters, so adding the extra step of drawing from the likelihood (obs) seems unnecessary. On the other hand, we know the variance of the distribution of y from the posterior, and it seems a shame to throw that information away.
Q3) Is there a reason to expect the credible intervals to differ much between the training period and the forecast period? In both cases the intervals are based on the forecast values, which depend only on the posterior samples of the coefficients, so I don't expect the credible intervals to widen drastically in the forecast period relative to the training period. Does that make sense?
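To make Q2 concrete, here is a small simulation with hand-made posterior draws (pure NumPy stand-ins, not samples from the model above): intervals built from mu reflect only parameter uncertainty, while intervals built from obs additionally fold in the likelihood noise σ, so they are systematically wider.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 4000

# synthetic "posterior" draws for a one-predictor model (illustrative stand-ins)
beta0 = rng.normal(1.0, 0.1, size=n_draws)            # intercept draws
beta = rng.normal(2.0, 0.1, size=n_draws)             # slope draws
sigma = np.abs(rng.normal(1.0, 0.05, size=n_draws))   # noise-scale draws

x = 0.5
mu_draws = beta0 + beta * x                    # parameter uncertainty only ("mu")
obs_draws = mu_draws + rng.normal(0.0, sigma)  # plus observation noise ("obs")

mu_width = np.quantile(mu_draws, 0.975) - np.quantile(mu_draws, 0.025)
obs_width = np.quantile(obs_draws, 0.975) - np.quantile(obs_draws, 0.025)
print(mu_width < obs_width)  # True: the "obs" interval is wider
```

So the choice between mu and obs is the usual distinction between a credible interval for the mean response and a posterior-predictive interval for a new observation; neither throws information away, they answer different questions.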