Yes, there is a python loop involved, but it's kind of hidden.

The deterministic, `P`, is a theano tensor variable. To execute the operation and get a concrete value back, we first compile the expression into a callable function. That's where `_compile_theano_function` comes in. In your case, you basically get a function that looks like `P = f(logP)`.
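To make that concrete, here is a minimal sketch in plain theano (not pymc3's actual code) of what such a compiled function looks like:

```python
import theano
import theano.tensor as tt

# A scalar theano variable standing in for logP (ndim=0)
logP = tt.dscalar("logP")
P = 10 ** logP  # the deterministic expression

# Roughly what _compile_theano_function produces: a callable P = f(logP)
f = theano.function([logP], P)
print(f(2.0))  # 100.0
```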
A common complication with RV shapes is that theano is very picky about the number of dimensions of its tensor variables. As you've written them down, `P` and `logP` are scalars, which means that they have `ndim=0`. If you try to pass a vector of `logP` values as input to `f`, you'll get a `ValueError`.
The way we get `draw_values` to work under these circumstances is to wrap `f` with `numpy.vectorize`, using a `signature` that is made aware of the dimensions of the theano tensors. In the end, numpy is responsible for the looping inside of `vectorize`.
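As an illustration of that wrapping (a sketch of the idea, not the exact `draw_values` internals), the `"()->()"` signature below tells numpy that the wrapped function maps a scalar to a scalar:

```python
import numpy as np
import theano
import theano.tensor as tt

logP = tt.dscalar("logP")
f = theano.function([logP], 10 ** logP)  # compiled function, expects ndim=0 input

# Calling f(np.array([1.0, 2.0])) directly would fail, because f only
# accepts scalar inputs. Wrapping it with numpy.vectorize and a
# scalar-to-scalar signature broadcasts it over an array of logP draws;
# numpy does the element-wise looping in Python.
vec_f = np.vectorize(f, signature="()->()")
print(vec_f(np.array([0.0, 1.0, 2.0])))  # [  1.  10. 100.]
```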
You ask: why do we have to vectorize like this? The answer is that the pymc3 distributions are not pure theano `Op`s, so `random` has to work around a lot of theano and numpy interaction, which ends up being slower.
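You can see the numpy side of this directly: a distribution's `random` method returns plain numpy arrays without going through a compiled theano graph. For example:

```python
import pymc3 as pm

# Uniform.dist builds a standalone distribution object; its random method
# draws with numpy's random machinery rather than a compiled theano Op.
draws = pm.Uniform.dist(0, 4).random(size=5)
print(draws)  # five numpy draws from Uniform(0, 4)
```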
How can you improve prior predictive speed? You can make a separate model for prior predictive sampling, choosing the RVs' shapes so that the deterministic you are interested in is computed entirely in theano, without relying on `numpy.vectorize`, like this:
```python
import time

import numpy as np
import pymc3 as pm
from matplotlib import pyplot as plt

times1 = []
times2 = []
sizes = 2 ** np.arange(1, 20 + 1)

# Model with scalar RVs: draw_values must loop with numpy.vectorize
with pm.Model() as model1:
    logP = pm.Uniform("logP", 0, 4)
    P = pm.Deterministic("P", 10 ** logP)

for size in sizes:
    t1 = time.time()
    samples1 = pm.distributions.distribution.draw_values([model1.P], size=size)
    times1.append(time.time() - t1)

    # Model with the sample size baked into the RV's shape: the
    # deterministic is computed entirely inside theano
    with pm.Model() as model2:
        logP = pm.Uniform("logP", 0, 4, shape=(size,))
        P = pm.Deterministic("P", 10 ** logP)
        t2 = time.time()
        samples2 = pm.distributions.distribution.draw_values([model2.P])
        times2.append(time.time() - t2)
    assert samples1[0].shape == samples2[0].shape

plt.loglog(sizes, times1, "-ob")  # blue: scalar model + size argument
plt.loglog(sizes, times2, "-or")  # red: shaped model
plt.show()
```
This gives you the following sampling times as a function of sample sizes:
The roughly constant sampling time you see for the first red dots is the theano compilation overhead in `_compile_theano_function`. You only see this overhead in the blue curve's first point, because the compiled function is cached for future use.