How to Set Parameter Distributions for a Bayesian Neural Network with PyMC3

I have developed a Bayesian neural network model; the code is shown below:

and I trained it like this:

the outcome:

On the whole, the predicted results (i.e. the expected value, displayed as the orange line) are acceptable. However, I still have several questions about the model:

  1. Since I have no knowledge of the prior distribution of the weights or biases, I gave them a Normal prior, N(0, 1). Is that reasonable? I wonder whether drawing enough samples could overcome this choice and still yield the right posterior distribution.
  2. The likelihood for out was set to a Normal distribution, but the output should actually be positive. Any advice on how to find a distribution that meets this requirement?
  3. The 95% confidence interval is too wide. I wonder whether this is related to the likelihood for out.

Thanks a lot for your advice!

  1. Yes, that’s fine. A zero-mean Normal prior on the weights is mathematically equivalent to L2 regularization. Given enough evidence from the data, the posterior will overcome the limited range of your prior. (Alternatively, you can increase the sd if you don’t want as much regularization.)

  2. pm.HalfNormal is positive only. You could try that.

  3. How do you calculate the 95% confidence interval in your plot? And how do you know it is “too wide”?
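
The link between a Normal prior and L2 regularization can be checked numerically. The sketch below is a minimal NumPy illustration on a linear model, not the neural network from the question; the data, `noise_sd`, and `prior_sd` are all made-up assumptions. It checks that the ridge solution with penalty `noise_sd**2 / prior_sd**2` is a stationary point of the negative log posterior, and that a wider prior shrinks the weights less.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear-regression data (illustrative values only).
n, d = 50, 3
X = rng.normal(size=(n, d))
true_w = np.array([1.5, -2.0, 0.5])
noise_sd = 0.5   # assumed known observation noise
prior_sd = 1.0   # weights ~ N(0, prior_sd^2), as in the question
y = X @ true_w + rng.normal(scale=noise_sd, size=n)


def ridge_solution(prior_sd):
    """Closed-form minimizer of ||y - Xw||^2 + lam * ||w||^2."""
    lam = noise_sd**2 / prior_sd**2  # penalty strength implied by the prior
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def grad_neg_log_posterior(w, prior_sd):
    """Gradient of -log p(w | X, y) for a Normal likelihood and Normal prior."""
    grad_likelihood = -X.T @ (y - X @ w) / noise_sd**2
    grad_prior = w / prior_sd**2
    return grad_likelihood + grad_prior


w_ridge = ridge_solution(prior_sd)
grad_at_ridge = grad_neg_log_posterior(w_ridge, prior_sd)

# A wider prior means a smaller penalty, hence less shrinkage towards zero.
w_wide = ridge_solution(prior_sd=10.0)

print("ridge / MAP weights:", w_ridge)
print("gradient of negative log posterior there:", grad_at_ridge)
print("weights under the wider prior:", w_wide)
```

The gradient vanishing at the ridge solution is what "equivalent" means here: the MAP estimate and the L2-penalized least-squares estimate coincide. It says nothing about the rest of the posterior, which the penalized fit does not give you.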

Thanks, simon_o :grinning:

  1. Could you point me to more information on setting prior distributions (e.g. literature, web links)?

  2. “pm.HalfNormal” does not seem to have a “mean” parameter, which is what I link to the variable “act_out”. So how could I use HalfNormal?

  3. I calculate the 95% confidence interval like this:

  4. Following on from question 3, how should I judge whether the 95% confidence interval is reasonable? Since the interval is calculated from the posterior predictive samples, the number of samples drawn by sample_ppc() may be vital for its correctness. Also, how does sample_ppc() work? Does it sample from the distribution formed by the trace (which is generated by pm.sample())? If so, what sampling method does it use?

Looking forward to your reply.

Not simon_o here, but some thoughts:

  1. As mentioned previously, using a prior over the weights is equivalent to regularizing the cost function. In particular, a Gaussian prior over the weights leads to L2 regularization. Section 7.5 of “Machine Learning: A Probabilistic Perspective” by Kevin Murphy shows a good derivation of it. Here’s a good StackExchange explanation too.

  2. That’s right, HalfNormal is intentionally a truncated distribution that is strictly positive. If you really want to shift that boundary, you can just add some scalar amount to your random variable. For instance,
    mu = 5 + pm.HalfNormal('mu', sd=1) so that the boundary is now at 5 instead of 0.

  3. First, you may want to use sample_posterior_predictive() since sample_ppc() is deprecated. Second, you may find the hpd() function helpful, which calculates the credible interval for you from the samples.

  4. I don’t totally understand your question, but this example may be helpful. To my understanding (someone should correct me if I’m wrong), the function samples from the posterior predictive distribution by running the generative model forwards. In other words, it takes the model you defined, plugs in parameter values drawn from the trace (i.e. from the posterior), and carries the computation through the model to output a final quantity representing a sample from the posterior predictive distribution.
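
As a rough illustration of items 3 and 4, the sketch below mimics posterior predictive sampling with plain NumPy instead of PyMC3: it picks parameter draws from a stand-in "trace" and pushes each one through a Normal likelihood. The arrays `mu_draws` and `sigma_draws` are hypothetical placeholders for a real trace, and the interval is an equal-tailed percentile interval, which is not the same as what hpd() computes, although the two are close for symmetric, unimodal samples.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for a trace: posterior draws of the network output (the mean) at one
# input point and of the noise sd. These numbers are invented for illustration.
n_draws = 2000
mu_draws = rng.normal(loc=3.0, scale=0.2, size=n_draws)
sigma_draws = np.abs(rng.normal(loc=0.5, scale=0.05, size=n_draws))


def posterior_predictive(n_samples):
    """Pick posterior draws at random and sample the likelihood given each one."""
    idx = rng.integers(0, n_draws, size=n_samples)
    return rng.normal(loc=mu_draws[idx], scale=sigma_draws[idx])


def central_interval(samples, prob=0.95):
    """Equal-tailed interval containing `prob` of the samples."""
    tail = 100.0 * (1.0 - prob) / 2.0
    lower, upper = np.percentile(samples, [tail, 100.0 - tail])
    return lower, upper


# Interval for the mean alone (parameter uncertainty only).
mu_lower, mu_upper = central_interval(mu_draws)
print(f"mean only: [{mu_lower:.2f}, {mu_upper:.2f}]")

# Predictive intervals for different numbers of predictive samples.
for n_samples in (100, 1000, 10000):
    lower, upper = central_interval(posterior_predictive(n_samples))
    print(f"{n_samples:>6} predictive samples: [{lower:.2f}, {upper:.2f}]")
```

Under these assumptions the predictive interval is wider than the interval for the mean alone, because it also includes the observation noise from the likelihood. Drawing more predictive samples mainly makes the interval endpoints less noisy; it does not make the interval narrower.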

Finally, not to be too pedantic, but you’re referring to these as “confidence intervals” when in fact these are credible intervals. It’s worth pointing out because the way these intervals are constructed and their resulting interpretations are very different.
