I am not sure I understand some of what you are saying. I am fairly new to PyMC3 and probabilistic programming, so excuse my ignorance. For example what do you mean by ‘true weights’? Also I don’t understand why the correct value for the posterior of the noise would be 1…
But why don’t I just show you? Here is what I get with az.summary:
(Sorry, IPython truncates the output for some reason.)
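As an aside, the truncation seems to come from pandas rather than from ArviZ: `az.summary` returns an ordinary pandas DataFrame, so raising pandas' display limits (or printing with `to_string()`) should show the whole table. A small sketch, using a stand-in DataFrame in place of the real summary (with a real trace you would call `az.summary(trace)` instead):

```python
import pandas as pd

# az.summary returns a pandas DataFrame, so the "..." elision is controlled
# by pandas' display options, not by ArviZ itself.
pd.set_option("display.max_rows", None)
pd.set_option("display.max_columns", None)

# Stand-in for az.summary(trace): 100 weight rows with mean/sd columns.
summary = pd.DataFrame(
    {"mean": [0.1] * 100, "sd": [1.0] * 100},
    index=[f"w[{i}]" for i in range(100)],
)

full_text = summary.to_string()  # renders every row, nothing elided
n_lines = full_text.count("\n") + 1  # 1 header line + 100 rows
print(n_lines)
```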
And here is what I get with plot_trace:
Note that in this case I am using a BNN with only 2 layers, unlike in my original question where I used 4. And here is what the NN's predictions look like in this case, in case you were curious (I also added this to my question because I think this version of the plots illustrates the issue better):
The BNN captures the trend in the variance, but the values themselves are far off…
Regarding your last question:
I draw 1000 posterior samples for ‘y’ and ‘sigma’. In my “BNN Output Variance” plot (top right in my question above) I then plot two curves. First, for every value of Xnorm (=\lambda_1) I take the standard deviation of ‘y’ across all 1000 samples, using SD = np.std(samples['y'], axis=0), and plot it under the name “Total”. Second, I take the average of ‘sigma’ across all 1000 samples, using np.mean(samples['sigma'], axis=0), and plot that under “Likelihood”. I hope this explains it. In the top-left plot, on the other hand, I plot the mean value of ‘y’ (blue line) and surround it with a shaded area from -2*SD to +2*SD.
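The steps above can be sketched in a few lines of NumPy. Here synthetic arrays stand in for the real posterior samples; the assumed shapes are `samples['y']` and `samples['sigma']` of shape `(n_samples, n_points)`, i.e. one row per posterior draw and one column per value of Xnorm:

```python
import numpy as np

# Synthetic stand-ins for the 1000 posterior samples of 'y' and 'sigma'
# (assumed shape: one row per draw, one column per Xnorm value).
rng = np.random.default_rng(0)
n_samples, n_points = 1000, 50
samples = {
    "y": rng.normal(0.0, 1.0, size=(n_samples, n_points)),
    "sigma": np.abs(rng.normal(1.0, 0.1, size=(n_samples, n_points))),
}

# "Total" curve: spread of 'y' across draws at each input value.
total_sd = np.std(samples["y"], axis=0)

# "Likelihood" curve: posterior mean of the noise scale 'sigma'.
likelihood_sd = np.mean(samples["sigma"], axis=0)

# Top-left plot: mean prediction with a +/- 2*SD shaded band.
y_mean = np.mean(samples["y"], axis=0)
lower = y_mean - 2 * total_sd
upper = y_mean + 2 * total_sd
```

Each of the four arrays is then one curve (or band edge) over the Xnorm axis.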