Hi there
I am using pymc3 to infer the parameters of brain models. For this, I run a simulation at each sampling step using theano.scan. Unfortunately, sampling becomes very slow as the simulation length increases. I suspect this is due to the theano.scan function, but I am not 100% sure.
This is the setup I use:
```python
import pymc3 as pm
import theano
import theano.tensor as tt

with pm.Model():
    dt = theano.shared(0.1, name="dt")
    x_init_star = pm.Normal("x_init_star", mu=0.0, sd=1.0, shape=shape[1:])
    x_init = pm.Deterministic("x_init", 0.0 + x_init_star)
    BoundedNormal = pm.Bound(pm.Normal, lower=0.0)
    noise = BoundedNormal("noise", mu=0.0, sd=1.0)
    amplitude_star = pm.Normal("amplitude_star", mu=0.0, sd=1.0)
    amplitude = pm.Deterministic("amplitude", 1.0 + amplitude_star)
    offset_star = pm.Normal("offset_star", mu=0.0, sd=1.0)
    offset = pm.Deterministic("offset", 0.0 + offset_star)
    epsilon = BoundedNormal("epsilon", mu=0.0, sd=1.0)
    x_t = pm.Normal(name="x_t", mu=0.0, sd=1.0, shape=shape)
    x_sim, updates = theano.scan(fn=scheme, sequences=[x_t], outputs_info=[x_init], n_steps=shape[0])
    x_hat = pm.Deterministic(name="x_hat", var=amplitude * x_sim + offset)
    x_obs = pm.Normal(name="x_obs", mu=x_hat, sd=epsilon, shape=shape, observed=obs)
```
The step function `scheme` is defined as (it receives one noise sample from `sequences` and the previous state from `outputs_info`):

```python
def scheme(x_eta, x_prev):
    # stochastic Euler step: drift from dfun, diffusion scaled by sqrt(dt)
    x_next = x_prev + dt * dfun(x_prev, params) + tt.sqrt(dt) * x_eta * noise
    return x_next
```
The `dfun` function is a brain-model-specific function that defines how the next step is computed, and `params` are the corresponding model parameters, which I define beforehand as pymc3 distributions.
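To make the recursion concrete, here is a plain-NumPy sketch of what the scan computes, with a hypothetical linear drift standing in for the actual (unspecified) `dfun` and fixed placeholder values for `dt`, `noise`, and `params`:

```python
import numpy as np

def dfun(x, params):
    # hypothetical drift for illustration: simple mean reversion, params = (theta,)
    return -params[0] * x

def simulate(x_eta, x_init, dt=0.1, noise=1.0, params=(1.0,)):
    """Euler-Maruyama integration: one step per noise sample in x_eta,
    mirroring theano.scan(fn=scheme, sequences=[x_t], outputs_info=[x_init])."""
    x = x_init
    out = []
    for eta in x_eta:  # plays the role of sequences=[x_t]
        x = x + dt * dfun(x, params) + np.sqrt(dt) * eta * noise
        out.append(x)
    return np.array(out)

rng = np.random.default_rng(0)
x_eta = rng.standard_normal((3001, 2, 1, 1))   # matches shape=(3001, 2, 1, 1)
x_sim = simulate(x_eta, x_init=np.zeros((2, 1, 1)))
print(x_sim.shape)  # (3001, 2, 1, 1)
```

The point of the sketch is that the 3001 steps are inherently sequential: each step depends on the previous one, so scan cannot parallelize over the time dimension.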
For `shape=(3001, 2, 1, 1)`, my setup needs approximately 12 hours with `pm.sample(draws=500, tune=500, cores=2)`.
As mentioned above, I suspect `theano.scan` is the culprit. Or could it be that sampling `x_t` is inefficient because of its large size?
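For a rough sense of scale, here is a back-of-the-envelope count of `scheme()` evaluations (a sketch assuming pymc3's default NUTS `max_treedepth=10`, i.e. at most 2**10 - 1 = 1023 leapfrog steps per draw; real trajectories are often shorter):

```python
# Worst-case count of scheme() evaluations during sampling.
draws_per_chain = 500 + 500          # draws + tune
chains = 2
leapfrog_steps = 2**10 - 1           # NUTS worst case per draw (max_treedepth=10)
scan_steps = 3001                    # shape[0]: one scheme() call per scan step

# Each leapfrog step needs one logp-and-gradient evaluation, which runs
# the full scan forward (and its gradient pass backward).
evals = draws_per_chain * chains * leapfrog_steps * scan_steps
print(f"{evals:,}")  # 6,140,046,000 -- up to ~6.1 billion evaluations
```

Even well short of the worst case, the per-draw cost scales linearly with the scan length, which would match the slowdown I see when the simulation gets longer.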
I am using:
pymc3 version: 3.11.5
theano-pymc version: 1.1.2
python version: 3.7.12
Operating System: macOS Big Sur 11.6 with Apple M1
Thank you for your answers!