Excessively slow evaluation?

I’m doing some visualisation of posterior predictions, and I’m calling a function that is also used in the PyMC3 model itself. I’m noticing that evaluating this function is excessively slow, so I’m wondering whether this is reasonable or a bit odd that it takes this long.

import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt
import time

def my_func(x, θ):
    return pm.math.exp(-θ * x)

# faux posterior samples
n_samples = 1000
θ = np.random.randn(n_samples)

xi = np.linspace(-1, 2, 100)

# how long to evaluate the function 100 times? ----------------------
start = time.time()
for i in np.random.choice(n_samples, 100, replace=False):
    my_func(xi, θ[i]).eval()
end = time.time()
print(end - start)

# how long to calculate and plot? -----------------------------------
fig, ax = plt.subplots()
start = time.time()
for i in np.random.choice(n_samples, 100, replace=False):
    ax.plot(xi, my_func(xi, θ[i]).eval(), c="k", alpha=0.1)
# plot posterior mean
ax.plot(xi, my_func(xi, np.mean(θ)).eval(), c="r", lw=3)
end = time.time()
print(end - start)

This results in

  • 8.2 seconds to evaluate
  • 8.0 seconds to evaluate and plot

But if I run it with a NumPy function instead, it’s far faster:

def my_np_func(x, θ):
    return np.exp(-θ * x)

# how long to evaluate the numpy function 100 times? ----------------
start = time.time()
for i in np.random.choice(n_samples, 100, replace=False):
    my_np_func(xi, θ[i])
end = time.time()
print(end - start)

# how long to calculate and plot? -----------------------------------
fig, ax = plt.subplots()
start = time.time()
for i in np.random.choice(n_samples, 100, replace=False):
    ax.plot(xi, my_np_func(xi, θ[i]), c="k", alpha=0.1)
# plot posterior mean
ax.plot(xi, my_np_func(xi, np.mean(θ)), c="r", lw=3)
end = time.time()
print(end - start)

resulting in

  • 0.0032 seconds for evaluation only
  • 0.03 seconds for evaluation and plotting

Any thoughts on whether this is normal, or what might be done to improve it? Note, my_func is just a random example here, but it’s a helper function called by a PyMC3 model, which you also want to reuse when visualising the posterior predictions.

I would guess the slowdown is because Theano recreates and recompiles the function in every iteration of the loop. This shouldn’t happen inside your PyMC3 model, where the function is compiled only once, together with the rest of the model logp.

Here is the performance when the function is compiled only once:

import theano
import theano.tensor as tt

import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt
import time

def my_func(x, θ):
    return pm.math.exp(-θ * x)

def compile_my_func():
    x = tt.vector()
    θ = tt.scalar()
    result = my_func(x, θ)
    return theano.function([x, θ], result)

# faux posterior samples
n_samples = 1000
θ = np.random.randn(n_samples)

xi = np.linspace(-1, 2, 100)

# how long to evaluate the function 100 times? ----------------------
start = time.time()
for i in np.random.choice(n_samples, 100, replace=False):
    my_func(xi, θ[i]).eval()
end = time.time()
print(end - start)  # 5 seconds on Colab

# how long to evaluate the compiled function 100 times? -----------
start = time.time()
my_func_compiled = compile_my_func()
for i in np.random.choice(n_samples, 100, replace=False):
    my_func_compiled(xi, θ[i])
end = time.time()
print(end - start)  # 0.03 seconds on Colab
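A further option, beyond compiling once (this vectorized variant is my own addition, not something the original code does): if the helper is elementwise, as in this example, you can drop the Python loop entirely and let NumPy broadcasting evaluate every posterior draw in one call. A minimal sketch:

```python
import numpy as np

def my_np_func(x, theta):
    return np.exp(-theta * x)

n_samples = 1000
rng = np.random.default_rng(0)
theta = rng.standard_normal(n_samples)
xi = np.linspace(-1, 2, 100)

# Pick 100 posterior draws, then broadcast: theta as a column vector
# against xi as a row vector gives a (100, 100) array, one curve per draw.
idx = rng.choice(n_samples, 100, replace=False)
curves = my_np_func(xi[None, :], theta[idx][:, None])
print(curves.shape)  # (100, 100)
```

Each row of `curves` is one curve, so the whole fan of posterior lines can then be drawn with a single `ax.plot(xi, curves.T, c="k", alpha=0.1)` rather than 100 separate plot calls.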