PyMC3 with Keras


I have already built some deep networks with Keras (the usual way), and now I want to see how they would behave in a Bayesian context.

Suppose my model is the following:

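from keras.layers import Input, Dense, Activation
from keras.models import Model

# X_train is the training data, with shape (n_samples, n_features)
x_in = Input(shape=X_train.shape[1:])
h = Dense(10, use_bias=False)(x_in)
out = Activation('softmax')(h)
model = Model(x_in, out)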

How would you go about putting priors on the weights using PyMC3? The only tutorial that I have found is this, which seems fairly complex given my limited Python knowledge. I do realise that the above is a fairly simple linear model, but I can expand from here if I have some guidance.

I would highly appreciate it if someone could show me how to wrap this in a PyMC3 model, if that is possible at all. I am using Keras 2.1.5. I can downgrade if necessary.


The example in the docs is not exactly what you are trying to do: there, Keras is used to build a neural network that approximates some function within a Bayesian model.

I am not sure what the easiest way is to initialize Keras weights (and potentially biases) with PyMC3 RVs. @ferrine has a blog post using Lasagne and gelato (a library he developed) that does exactly this, but I am not sure how easy it is in Keras.

Also cc @ericmjl, as he works with neural networks a bit more.


One way to do it is to add a new regularizer, which means you express the logp of the prior as a penalty term.
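To make the "prior as penalty" idea concrete, here is a minimal sketch in plain Python. It assumes independent Normal(0, sigma) priors on the weights, which is my choice for illustration and not something the thread fixes; the function names are hypothetical. Under that assumption the negative log-prior is an L2 penalty with strength 1 / (2 * sigma^2), plus a constant that does not depend on the weights.

```python
import math

def neg_log_normal_prior(weights, sigma=1.0):
    """Negative log-density of independent Normal(0, sigma) priors on each weight."""
    n = len(weights)
    const = n * (math.log(sigma) + 0.5 * math.log(2.0 * math.pi))
    return const + sum(w * w for w in weights) / (2.0 * sigma ** 2)

def l2_penalty(weights, lam):
    """Standard L2 (weight-decay) penalty: lam * sum(w^2)."""
    return lam * sum(w * w for w in weights)

sigma = 1.0                       # assumed prior standard deviation
lam = 1.0 / (2.0 * sigma ** 2)    # matching L2 strength

weights = [0.5, -1.2, 2.0]
# The two quantities differ only by a constant that is independent of the weights.
gap = neg_log_normal_prior(weights, sigma) - l2_penalty(weights, lam)
```

In Keras terms this would correspond to an L2 kernel regularizer on the layer, but that mapping is only a sketch of the idea, not a recipe from the thread.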


Well, that would just give you the MAP estimate, wouldn’t it? Unless you meant that I can somehow feed this logp into PyMC3 models?


Not necessarily; it depends on how you do the inference.
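As a rough illustration of that point, the same log-prior plus log-likelihood can either be maximised (giving the MAP estimate) or sampled from (giving the posterior). The sketch below is pure Python with a made-up one-weight regression; the data, the Normal(0, 1) prior, the unit noise variance, and the `metropolis` helper are all assumptions for illustration, not anything from PyMC3 or Keras.

```python
import math
import random

# Toy data for y ≈ w * x (made up for illustration)
xs = [0.5, 1.0, 1.5, 2.0]
ys = [1.1, 1.9, 3.2, 3.9]

def log_posterior(w):
    """Unnormalised log-posterior: Normal(0, 1) prior plus unit-variance Gaussian likelihood."""
    log_prior = -0.5 * w * w
    log_lik = -0.5 * sum((y - w * x) ** 2 for x, y in zip(xs, ys))
    return log_prior + log_lik

# Inference option 1: optimise. For this quadratic log-posterior the MAP has a closed form.
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))
w_map = sxy / (1.0 + sxx)

# Inference option 2: sample from the same log-posterior with random-walk Metropolis.
def metropolis(logp, start, n_steps, step_size, seed=0):
    rng = random.Random(seed)
    w, lp = start, logp(start)
    samples = []
    for _ in range(n_steps):
        proposal = w + rng.gauss(0.0, step_size)
        lp_proposal = logp(proposal)
        # Accept with probability min(1, exp(lp_proposal - lp))
        if rng.random() < math.exp(min(0.0, lp_proposal - lp)):
            w, lp = proposal, lp_proposal
        samples.append(w)
    return samples

samples = metropolis(log_posterior, start=0.0, n_steps=20000, step_size=0.5)
kept = samples[2000:]  # discard burn-in
post_mean = sum(kept) / len(kept)
post_sd = math.sqrt(sum((s - post_mean) ** 2 for s in kept) / len(kept))
```

Optimising yields a single number (`w_map`), whereas sampling also yields a spread (`post_sd`), which is the uncertainty a Bayesian treatment is after.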


Neural networks are nothing more than affine transforms + nonlinearity functions.
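A minimal sketch of that statement in plain Python, mirroring the bias-free dense layer followed by softmax from the question. The helper names and the tiny weight matrix are hypothetical and chosen only to keep the example readable.

```python
import math

def dense(x, weights):
    """Linear transform without bias: x has n_features entries, weights is n_features x n_units."""
    n_units = len(weights[0])
    return [sum(x[i] * weights[i][j] for i in range(len(x))) for j in range(n_units)]

def softmax(z):
    """Nonlinearity: maps real-valued scores to probabilities that sum to 1."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

x = [1.0, 2.0]                 # one input with two features
weights = [[0.1, 0.2, 0.3],    # 2 features x 3 output units
           [0.0, -0.1, 0.4]]

scores = dense(x, weights)     # affine step
probs = softmax(scores)        # nonlinearity step
```

A Bayesian version keeps exactly this computation and treats `weights` as random variables with priors instead of fixed numbers.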

I tried doing a talk explaining it at PyData NYC, “An attempt at demystifying Bayesian deep learning.”

(On my phone right now, so I can’t conveniently get the link, but you’ll definitely be able to find it via Google.)

If you get how to write linear and logistic regression using PyMC3, then you’ll be able to see the parallels to deep learning from that talk.


I’ll loop back this afternoon (if I can remember) with an example of how to convert your Keras model into Theano + PyMC3 code. But if you get to that faster than I do, props to you! :smiley:


Here’s an implementation using PyMC3 + Theano.

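import pymc3 as pm
import theano.tensor as tt

# Assumes X (inputs, shape (n_samples, n_features)) and data (observed targets) are already defined.
with pm.Model() as nn_model:
    # Prior on the weights of the single dense layer (10 units, no bias)
    w1 = pm.Normal('w1', 0, 1, shape=(X.shape[1], 10))
    h = tt.dot(X, w1)       # linear transform
    h = tt.nnet.softmax(h)  # nonlinearity
    like = pm.Normal('likelihood', mu=h, observed=data)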

The key to understanding neural networks is that they are basically linear (affine) transforms of the data with nonlinearity functions applied. Once you get this, Bayesian neural networks are nothing more than placing priors on those weights!


Hey Eric,

Thanks for that. I had followed Thomas Wiecki’s blog, so I have a decent idea of how to use base Theano to implement the model. I was just hoping to plug in Keras directly without actually writing Theano code, because it’s cleaner and because I had already done it for ‘normal’ neural nets.

Hopefully this will be easier to do with PyMC4 if the TensorFlow backend is used. That would definitely make it easier for people working with neural nets to transition to Bayesian methods.

Thanks for the pointers though!