I generate the data myself, but of course not with a neural network. I generate it with a completely different set of equations that has only 4-5 parameters and looks nothing like the activations of a neural network. Here I am trying to emulate the output of those equations with a neural network, so there are no true values for the network weights and biases.
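For sigma, I sample it from a HalfNormal with a scale (the standard deviation of the underlying normal) of 0.1, so I would expect it to be positive with a mean below 0.1. The prior mean of that HalfNormal is 0.1 * sqrt(2/pi) ≈ 0.08, so anything well below that would have to come from the data rather than the prior.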
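As a quick sanity check on what a HalfNormal with scale 0.1 implies for sigma, here is a minimal NumPy sketch. It is not the model from this thread. The helper `half_normal_mean`, the seed, and the sample size are my own illustrative choices, and I draw the samples as the absolute value of a zero-mean normal rather than through any particular probabilistic-programming library.

```python
import math
import numpy as np


def half_normal_mean(scale: float) -> float:
    """Analytic mean of a HalfNormal with the given scale: scale * sqrt(2 / pi)."""
    return scale * math.sqrt(2.0 / math.pi)


rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
scale = 0.1

# If X ~ Normal(0, scale), then |X| ~ HalfNormal(scale).
samples = np.abs(rng.normal(loc=0.0, scale=scale, size=200_000))

analytic_mean = half_normal_mean(scale)
empirical_mean = samples.mean()

print(f"analytic prior mean:  {analytic_mean:.4f}")
print(f"empirical prior mean: {empirical_mean:.4f}")
```

Both numbers should come out close to 0.08, so the prior on its own puts sigma only slightly below 0.1 on average.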