What’s the current “baseline” way to do it that you’re looking to improve?
Edit: I couldn’t run your scan method as posted; it gave a shape error. The following worked:
```python
import numpy as np
import pymc as pm
import pytensor
import pytensor.tensor as pt
from pymc.gp.util import stabilize


def get_f(lengthscale, z, data):
    # One whitened GP draw: f = L @ z, where L is the Cholesky factor
    # of the (jitter-stabilized) ExpQuad covariance matrix
    cov_i = pm.gp.cov.ExpQuad(1, ls=lengthscale).full(data)
    L_i = pt.linalg.cholesky(stabilize(cov_i))
    f_i = L_i @ z
    return f_i
```
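As a quick smoke test (hypothetical dummy values, just to see the shapes), you can evaluate `get_f` eagerly; each call produces a single GP draw:

```python
# Hypothetical smoke test: one scalar lengthscale, one whitened vector z
x_test = np.linspace(0, 1, 10)[:, None]  # (10, 1) input grid
z_test = np.random.randn(10)

f_test = get_f(1.0, z_test, x_test)
print(f_test.eval().shape)  # (10,): a single GP function draw
```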
```python
# Example sizes and inputs (stand-ins for the variables from your post)
n_id, n_x = 20, 50
x = np.linspace(0, 1, n_x)

with pm.Model() as model:
    x_Data = pm.Data("x_Data", x[:, None])
    y_Data = pm.Data("y_Data", np.random.randn(n_x, n_id))
    lengthscale = pm.Weibull("lengthscale", alpha=2.0, beta=1.0, shape=n_id)
    zs = pm.Normal("z", mu=0, sigma=1, shape=(n_id, n_x))

    # scan maps get_f over the leading axis of each sequence:
    # step i gets lengthscale[i] (scalar) and zs[i] (vector of length n_x)
    f, _ = pytensor.scan(
        fn=get_f,
        sequences=[lengthscale, zs],
        non_sequences=[x_Data],
    )

    y = pm.Normal("y", mu=f.T, sigma=1, observed=y_Data)
    idata = pm.sample(compile_kwargs={"mode": "NUMBA"})
```
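For intuition, the scan is just a symbolic loop over the leading axis. A sketch of the unrolled equivalent (this builds `n_id` separate subgraphs, so it's only practical for small `n_id`):

```python
# Unrolled equivalent of the scan above, for intuition only
f_rows = [get_f(lengthscale[i], zs[i], x_Data) for i in range(n_id)]
f_unrolled = pt.stack(f_rows)  # (n_id, n_x), same as scan's f
```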
You could also try vectorizing the creation of f instead of scanning. This is a compute/memory tradeoff: vectorizing materializes all n_id covariance matrices at once, but if you have enough memory for that it will be faster than scanning, since the computation isn’t done sequentially.
```python
with pm.Model() as model:
    x_Data = pm.Data("x_Data", x[:, None])
    y_Data = pm.Data("y_Data", np.random.randn(n_x, n_id))
    lengthscale = pm.Weibull("lengthscale", alpha=2.0, beta=1.0, shape=n_id)
    zs = pm.Normal("z", mu=0, sigma=1, shape=(n_id, n_x))

    # Batch get_f over the leading n_id dimension of lengthscale and zs;
    # x_Data is shared across the batch
    f = pt.vectorize(get_f, signature="(),(x),(x,o)->(x)")(lengthscale, zs, x_Data)

    y = pm.Normal("y", mu=f.T, sigma=1, observed=y_Data)
    idata = pm.sample(compile_kwargs={"mode": "FAST_RUN"})
```
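Reading the signature left to right: `lengthscale` contributes a scalar `()` per GP, `z` a vector `(x)`, and the input grid a matrix `(x,o)`; the extra leading `n_id` dimension on `lengthscale` and `zs` becomes the batch dimension. A quick shape check with hypothetical dummy inputs:

```python
# Hypothetical check of the batched shapes pt.vectorize produces
f_fn = pt.vectorize(get_f, signature="(),(x),(x,o)->(x)")
out = f_fn(
    pt.as_tensor(np.ones(3)),                      # 3 lengthscales: batch over ()
    pt.as_tensor(np.random.randn(3, 10)),          # 3 whitened vectors: batch over (x)
    pt.as_tensor(np.linspace(0, 1, 10)[:, None]),  # shared (10, 1) grid: (x,o)
)
print(out.eval().shape)  # (3, 10): one row of f per lengthscale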
In this case I got the best sampling speed from scan in the NUMBA backend, which beat vectorize in both the C and NUMBA backends. Within the C backend alone, though, vectorize was faster than scan.
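If you want to reproduce the comparison yourself, only the compile mode changes between runs (a sketch, reusing either model above):

```python
# Hypothetical backend comparison: same model, different compile modes
for mode in ["FAST_RUN", "NUMBA"]:  # C backend vs numba backend
    with model:
        idata = pm.sample(compile_kwargs={"mode": mode})
```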