Predict the mu or the observed?

Hi, I have a question about prediction. Please check the following code and the comments.
#data
import numpy as np
import pymc as pm

x_train, x_test = np.random.random((100, 5)), np.random.random((30, 5))
y_train, y_test = np.random.random((100, 1)), np.random.random((30, 1))

#train

with pm.Model() as model:
    a = pm.Normal("a", 0.0, 0.5)
    b = pm.Normal("b", 0.0, 1.0,shape=(5,1))
    mu = a + pm.math.dot(pm.MutableData("x",x_train),b)
    sigma = pm.Exponential("sigma", 1.0)

    # case 1: the prediction target is mu, which makes sense to me
    pm.Normal("obs", mu=pm.Deterministic("y", mu), sigma=sigma, observed=y_train)

    # case 2: but some examples I have seen do the following instead
    # pm.Normal("obs", mu=mu, sigma=sigma, observed=pm.MutableData("y", y_train))
    trace = pm.sample()

#predict
#case 1 works; case 2 raises a "shape mismatch" error

with model:
    pm.set_data({"x": x_test})
    # use the updated values and predict outcomes and probabilities:
    idata_2 = pm.sample_posterior_predictive(
        trace,
        var_names=["y"],
        return_inferencedata=True,
        predictions=True,
        #extend_inferencedata=True,
        random_seed=100,
    )
    # posterior mean of "y" for each test row, averaged over chains and draws
    y_pred_mean = idata_2.predictions["y"].mean(("chain", "draw"))

================
So my question is: where should we set "y" as the prediction target, on mu or on the observed variable?
I am really confused: many examples put the target on the observed variable, but I have not seen working code for that.
API consistency is also a real problem, because so many examples on the website do not work!
Thanks!

Your code seems to work for me. What version of pymc are you using?

Same for me, your code works fine. With PyMC 4.3.0 and Aesara 2.8.7

Thanks.
Yes, case 1 works, but case 2 (with the commented-out line enabled) fails at prediction time. My question is where to set the "prediction target".

Thanks, I am using 4.4.0 and case 1 works. I just wonder whether case 1 is the standard way to write a multivariable regression?

Both cases work for me. That’s why I asked.

Same. Can you update your PyMC @Number_Huang ?

I am using 4.4.0.

Wait. Do we have 4.4.0 already @cluhmann ? :sweat_smile:

Appears so: Release v4.4.0 · pymc-devs/pymc · GitHub

Niiiiiice! Going even faster than I can keep up :sweat_smile:
So did you test the code above with 4.4.0? I haven’t yet.

Works with pymc 4.4.0 for me.

Thanks Alex, I reran it and both work. So my actual question is: which one is correct, pm.Deterministic("y", mu) or pm.MutableData("y", y_train)? pm.MutableData("y", y_train) seems to make no sense to me, but I have seen that kind of code.

Whether or not you wrap your y in a MutableData object comes down to whether you might want to swap out the original y with new data at a later stage (much like you currently are doing for x). Whether you wrap mu in a Deterministic (which is what your case 1 names "y") comes down to whether you want to see its sampled values in your trace/InferenceData object. Neither is right or wrong.
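
To make the "mu or observed" distinction a bit more concrete, here is a minimal NumPy-only sketch of what the two prediction targets mean. It does not call PyMC at all: the arrays `a_draws`, `b_draws` and `sigma_draws` are hypothetical stand-ins for posterior draws, invented for illustration and not the output of any fitted model. Under that assumption, predicting the Deterministic gives draws of the regression mean, while predicting the observed variable adds the observation noise on top of it.

```python
import numpy as np

rng = np.random.default_rng(100)

# Hypothetical stand-ins for posterior draws (NOT from a real fit).
n_draws = 2000
a_draws = rng.normal(0.5, 0.05, size=n_draws)               # intercept "a"
b_draws = rng.normal(0.0, 0.05, size=(n_draws, 5, 1))       # slopes "b", (5, 1) per draw
sigma_draws = np.abs(rng.normal(0.3, 0.02, size=n_draws))   # noise scale "sigma"

x_test = rng.random((30, 5))

# Target 1: the regression mean mu = a + x @ b, one (30, 1) array per draw.
mu_draws = a_draws[:, None, None] + x_test @ b_draws

# Target 2: the observed variable, i.e. mu plus Normal(0, sigma) noise.
obs_draws = rng.normal(mu_draws, sigma_draws[:, None, None])

# Point predictions (averaged over draws) and their spread.
mu_mean, obs_mean = mu_draws.mean(axis=0), obs_draws.mean(axis=0)
mu_sd, obs_sd = mu_draws.std(axis=0), obs_draws.std(axis=0)

print("max |mu_mean - obs_mean|:", float(np.abs(mu_mean - obs_mean).max()))
print("mean sd of mu draws:     ", float(mu_sd.mean()))
print("mean sd of obs draws:    ", float(obs_sd.mean()))
```

In this sketch the two targets give nearly the same point predictions, but the spread of `obs_draws` is wider because it includes `sigma`. So the choice is less about correctness and more about whether you want uncertainty about the mean or about new observations.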

Thanks cluhmann. Detailed docs on PyMC's under-the-hood mechanisms, along with updated example code, are really needed. Thanks all anyway.