About convergence

I would like to ask about convergence. Judging from the trace plot, I don't think the chains have converged, but the Gelman-Rubin statistic does not look bad. I am confused… Have the chains converged?
{'w_in_1': array([[ 1.00213385, 1.00064216, 0.99963124],
                  [ 0.99984235, 1.00042028, 1.004435  ],
                  [ 0.99902311, 0.99902589, 0.99931508],
                  [ 1.00367372, 1.00227897, 1.00116156]]),
 'w_1_out': array([ 1.01398552, 1.00143865, 1.03143545]),
 'hidden1_bias': array([ 1.02987989, 1.00115633, 1.01321611]),
 'hidden_out_bias': 0.99984746336393138,
 'reg': array([ 0.99967364, 1.00127514, 1.00290437, ..., 1.00467686,
                1.00431589, 1.00392227])}
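For readers wondering how values this close to 1 can sit next to a suspicious trace plot, here is a minimal NumPy sketch of the classic (non-split) Gelman-Rubin statistic. It is an illustration under assumptions, not the code that produced the numbers above: the function name `gelman_rubin` and the simulated chains are hypothetical. The statistic only compares between-chain variance with within-chain variance, so chains that all jump between the same two modes can still score close to 1.

```python
import numpy as np


def gelman_rubin(chains):
    """Classic (non-split) Gelman-Rubin statistic for one scalar parameter.

    chains: array-like of shape (n_chains, n_draws).
    """
    chains = np.asarray(chains, dtype=float)
    n_chains, n_draws = chains.shape
    chain_means = chains.mean(axis=1)
    chain_vars = chains.var(axis=1, ddof=1)
    within = chain_vars.mean()                    # W: mean within-chain variance
    between = n_draws * chain_means.var(ddof=1)   # B: between-chain variance
    # Pooled estimate of the posterior variance
    var_hat = (n_draws - 1) / n_draws * within + between / n_draws
    return np.sqrt(var_hat / within)


rng = np.random.default_rng(0)
noise = 0.1 * rng.normal(size=(4, 2000))

# Hypothetical case 1: every chain hops between two modes at -2 and +2.
mixed = rng.choice([-2.0, 2.0], size=(4, 2000)) + noise
# Hypothetical case 2: two chains are stuck at -2 and two are stuck at +2.
stuck = np.array([-2.0, -2.0, 2.0, 2.0])[:, None] + noise

rhat_mixed = gelman_rubin(mixed)   # close to 1 despite the two modes
rhat_stuck = gelman_rubin(stuck)   # far above 1
print(rhat_mixed, rhat_stuck)
```

In the first case the statistic stays near 1 even though the marginal trace is clearly bimodal, which is why it is worth reading the trace plot alongside it.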

You shouldn’t look only at the Rhat; the trace plot shows that there are some problems with your model.
Try centering and orthogonalizing your input matrix, and also change the prior on hidden_out_bias (it seems you are using a Uniform prior? That is usually not recommended).
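One way to read "centering and orthogonalizing" is sketched below with NumPy: subtract the column means, then use a QR decomposition to replace the columns with uncorrelated ones. This is only one possible interpretation (PCA whitening would be another), and the helper name `center_and_orthogonalize` and the demo data are hypothetical.

```python
import numpy as np


def center_and_orthogonalize(X):
    """Center the columns of X, then make them uncorrelated with unit variance.

    Assumes X has shape (n_samples, n_features) and full column rank.
    """
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Reduced QR: the columns of Q are orthonormal and span the same space
    # as the centered columns, so they also have zero mean.
    Q, _ = np.linalg.qr(X_centered)
    # Rescale so that each column has unit sample variance.
    return Q * np.sqrt(X.shape[0] - 1)


# Hypothetical, strongly correlated two-column input.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
X_demo = np.column_stack([a, a + 0.05 * rng.normal(size=200)]) + 5.0

Z = center_and_orthogonalize(X_demo)
print(np.cov(Z, rowvar=False))  # approximately the identity matrix
```

Note that the new columns are linear combinations of the original features, so the network weights no longer refer to the raw inputs one-to-one.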

Hi, thanks for the reply. I will try that and see how it works. And no, I am using a Normal prior. Why did you think I was using a Uniform? (Is there some sign, or something visibly wrong, that made you think so?)

I am not sure I understand correctly, but I used sklearn.preprocessing.scale, so my input looks like this:

[[-0.7406701 -0.72388723]
[-0.77610039 -0.74161164]
[-0.79381553 -0.77706047]
[-1.09497295 -1.06065109]
[-1.11268809 -1.09609992]
[-1.16583352 -1.11382434]]

Thanks a lot!

The trace of hidden_out_bias looks bounded, with what appear to be two modes. That’s why I asked.

And yes, scaling with sklearn is usually sufficient.
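As a side check, scaling standardizes each column separately but leaves the correlation between columns unchanged. The short sketch below computes that correlation for the six rows quoted earlier in the thread. Those rows are only a slice of the input, so this says nothing certain about the full matrix, and the variable names are hypothetical.

```python
import numpy as np

# The six rows of the scaled input shown earlier in the thread.
X_rows = np.array([
    [-0.7406701, -0.72388723],
    [-0.77610039, -0.74161164],
    [-0.79381553, -0.77706047],
    [-1.09497295, -1.06065109],
    [-1.11268809, -1.09609992],
    [-1.16583352, -1.11382434],
])

# Pearson correlation between the two input columns.
corr = np.corrcoef(X_rows, rowvar=False)[0, 1]
print(corr)
```

If the full input shows a similarly high correlation, the two features carry nearly the same information, which is the situation orthogonalizing is meant to address.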

I set mu=3 for the bias term. Is the trace plot acceptable now?

Thanks a lot!

Looks much better - are you using the default initialization?

Yes, just trace = pm.sample().