Binary neural network training

I’m training a binary neural network (weights constrained to be either 0 or 1). In JAGS my model looks like this:

model {
  for (i in 1:length(y)) {
    y[i] ~ dbern(p[i])
    logit(p[i]) <- z[i,2] - z[i,1]
    z[i,1:2] <- w %*% x[i,]
  }
  for (i_output in 1:2) {
    for (j_input in 1:25) {
      w[i_output, j_input] ~ dbern(0.5)
    }
  }
}
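The logit(p[i]) <- z[i,2] - z[i,1] line works because a two-class softmax reduces to a sigmoid of the difference of the activations. A quick NumPy check of that identity (my own illustration, not part of the model):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

z = np.array([0.3, 2.1])           # arbitrary two-class activations
p_softmax = softmax(z)[1]          # P(class 1) via softmax
p_sigmoid = sigmoid(z[1] - z[0])   # P(class 1) via sigmoid of the difference
print(np.isclose(p_softmax, p_sigmoid))  # True
```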

I started simple: a vector of 25 binary values (a flattened MNIST 5x5 image) is fed into a network with binary weights, and the label (0 or 1) is taken as the argmax of the softmax of the activations W * x, like so:
y_predicted = argmax{ softmax(W * x) }
Once this toy example works, I’ll go for full MNIST 28x28 with a true softmax (not just a logit).
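For reference, the prediction rule above can be sketched in NumPy (W and x below are random placeholders, not the real data). Since softmax is monotone, the argmax of softmax(W @ x) is just the argmax of the activations themselves:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.integers(0, 2, size=(2, 25))  # binary 2x25 weight matrix (placeholder)
x = rng.integers(0, 2, size=25)       # binarized 5x5 input, flattened (placeholder)

z = W @ x                             # activations, shape (2,)
e = np.exp(z - z.max())
probs = e / e.sum()                   # softmax over the two classes
y_predicted = int(np.argmax(probs))   # predicted label, 0 or 1

# softmax is monotone, so this matches the argmax of the raw activations
assert y_predicted == int(np.argmax(z))
```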

I’d like to port this toy model to PyMC3. I’m struggling with the matrix multiplication W * x. How do I implement such a model?
What I expect to see in PyMC3 is something like this:

import pandas as pd
import pymc3 as pm

data = pd.read_csv("https://www.dropbox.com/s/l7uppxi1wvfj45z/MNIST56_train.csv?dl=1")
x = data.iloc[:, :25].values  # MNIST 5x5 flattened (first 25 columns)
y = data.iloc[:, 25].values   # label 0 or 1

with pm.Model() as basic_model:
    # Priors for unknown model parameters: a 2x25 matrix of Bernoulli(0.5) weights
    w = pm.Bernoulli('w', p=0.5, shape=(2, 25))
    # Activations z = x @ w.T, then logit(p) = z[:,1] - z[:,0] as in the JAGS model
    z = pm.math.dot(x, w.T)
    logit_p = z[:, 1] - z[:, 0]
    y_obs = pm.Bernoulli('y_obs', logit_p=logit_p, observed=y)

    # Slice sampling does not handle discrete variables; a binary sampler
    # such as BinaryGibbsMetropolis seems like the right choice here
    step = pm.BinaryGibbsMetropolis([w])
    trace = pm.sample(1000, step=step, chains=3, cores=1)
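Once sampling works, I assume predictions could be read off the posterior draws of w by averaging P(y=1) over draws. A self-contained sketch with placeholder draws (in practice w_draws would be trace['w']):

```python
import numpy as np

rng = np.random.default_rng(1)
w_draws = rng.integers(0, 2, size=(500, 2, 25))  # placeholder for trace['w']
x = rng.integers(0, 2, size=25)                  # one flattened 5x5 input (placeholder)

# For each draw: activations z, then logit(p) = z[:,1] - z[:,0], then P(y=1)
z = w_draws @ x                    # shape (500, 2)
logit_p = z[:, 1] - z[:, 0]
p = 1.0 / (1.0 + np.exp(-logit_p))
p_mean = p.mean()                  # posterior-mean probability of label 1
y_predicted = int(p_mean > 0.5)
```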