Junpeng’s modification improves things:
with pm.Model() as model:
theta_h = pm.Uniform("theta_h",lower=0,upper=1,shape=n)
theta_leak = pm.Beta("theta_leak",alpha=2,beta=20,shape=m,testval=0.05)
theta_i = pm.Beta("theta_i",alpha=2,beta=20,shape=n, testval = 0.05)
# Log transformed
wi = pm.math.log(theta_i)
w0 = pm.math.log(theta_leak)
AHwi = tt.dot(tt.dot(A_dense, tt.diag(theta_h)), wi)
p_v_given_h = 1 - pm.math.exp(w0 + AHwi)
v = pm.Bernoulli("v",p=p_v_given_h,shape=m,observed=observed)
trace = pm.sample()
With 1000 visible nodes and 10 hidden nodes, I’m getting about 22 it/s.
At 10000 visible and 100 hidden, sampling seems to be about 1.5 it/s.
Given that this really isn’t all that much data, I have to imagine there are ways to formulating this problem that will result in faster sampling speeds.