I’m a newcomer to PyMC and am trying to fit the means and covariance matrix of a 3D Gaussian. I’m assuming a Uniform prior on each of the 3 means, and an LKJCholeskyCov prior on the covariance matrix with n=3 and eta=2.

My forward model involves the following sequence of steps:

- Evaluate the probabilities of drawing each of 1 million pairs (x,y,z) given the current values of the 3D Gaussian’s mean vector and covariance matrix. The 1 million pairs are pre-defined.
- Do a tensordot of those resulting 1 million probabilities with 1 million 2D (20x20) maps corresponding to each of those 1 million (x,y,z) pairs. This would result in a single weighted 2D (20x20) map.
- Use a Poisson likelihood to compare the expected counts from that weighted 2D map with an actual observed 2D map that I have.

The following code works well (though it takes several hours) – here I have off-loaded the above sequence of steps into a separate function decorated with `pytensor.compile.ops.as_op`

which makes use of scipy/numpy functions:

```
@as_op(itypes=[pt.dvector,pt.dmatrix], otypes=[pt.dmatrix])
def pytensor_forward_model(theta_mu, theta_cov):
mvn = scipy.stats.multivariate_normal(mean=theta_mu,cov=theta_cov,allow_singular=True)
probs = mvn.pdf(points)
weighted_pdf_2d = np.tensordot(pdfs_2d,probs,axes=1).reshape((20,20))
weighted_pdf_2d = weighted_pdf_2d / (np.sum(weighted_pdf_2d) * bin_area)
return weighted_pdf_2d
with pm.Model() as model:
mu_x = pm.Uniform('mu_x', lower=0.01,upper=0.99)
mu_y = pm.Uniform('mu_y', lower=0.01,upper=0.99)
mu_z = pm.Uniform('mu_z', lower=-1,upper=1)
mu_vector = pt.stack([mu_x,mu_y,mu_z])
chol, corr, stds = pm.LKJCholeskyCov('chol', n=3, eta=2, sd_dist=pm.Exponential.dist(1.0,shape=3))
cov_matrix = pm.Deterministic('cov', chol.dot(chol.T))
mu_pdf = pytensor_forward_model(mu_vector, cov_matrix)
z_obs = pm.Poisson('z_obs', mu=mu_pdf*bin_area*poisson_hist2d.sum(), observed=poisson_hist2d)
```

When I do `pm.sample()`

, PyMC automatically assigns a Slice sampler for each of the 3 mean RVs, and Metropolis for `chol`

, and the code takes ~10 hours to run, which is I guess not bad given the gigantic tensordots involved. Using %timeit shows that pytensor_forward_model() takes ~50 ms per call.

In an effort to try to make it run faster, I rewrote the above to use purely pymc+pytensor operations hoping that maybe the pre-compilation + NUTS sampler would decrease the runtime. But the progress bar says ~100 hours… It’s also very slow if I specify the auto Slice+Metropolis sampler combo.

Here is my pure pymc+pytensor version of pytensor_forward_model() – these lines directly replace the `mu_pdf = pytensor_forward_model()`

line inside the with context (negating the need for the separate @as_op decorated function):

```
mvn = pm.MvNormal.dist(mu=mu_vector,chol=chol)
probs = pm.Deterministic('probs', pt.exp(pm.logp(mvn,points_tensor)))
weighted_pdf_2d = pm.Deterministic('weighted_pdf_2d',pt.tensordot(pdfs_2d,probs,axes=1).reshape((20,20)))
mu_pdf = pm.Deterministic('mu_pdf',weighted_pdf_2d / (pt.sum(weighted_pdf_2d) * bin_area))
```

where I created `points_tensor = pt.as_tensor_variable(points)`

right before/above the with context, otherwise pm.logp(mvn,points) complains that points isn’t a pytensor variable.

Did I do something wrong in the pure pytensor approach? Is it just slow because of my priors? Are there obvious ways to speed it up that I’m missing?