Hi there! I have two PR ideas for PyMC but was wondering whether they would be useful:
- I noticed that lots of people over-parametrize their Multinomial-Softmax models: they put all N categories in the model, while only N-1 of them are free and the last one is entirely determined by the others. Over-parametrization generally slows down sampling. At the same time, the over-parametrized version is more intuitive: the trick of fixing the value of one category always feels weird.
So, what do you think of modifying the API so that people can parametrize with all the categories, but PyMC fixes one of them under the hood? That way, the API becomes more intuitive and sampling speed stays optimal, without people having to artificially complicate their models (see the sketch below).
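To make this concrete, here is a minimal sketch of the manual trick as people have to write it today, assuming PyMC3 with Theano (the data and variable names are purely illustrative):

import numpy as np
import pymc3 as pm
import theano.tensor as tt

# illustrative counts over 3 categories
counts = np.array([12, 5, 8])

with pm.Model():
    # identifiable parametrization: only N-1 free scores;
    # the last category is pinned to 0 as the reference
    free_scores = pm.Normal("free_scores", 0.0, 1.0, shape=2)
    scores = tt.concatenate([free_scores, tt.zeros(1)])
    probs = tt.exp(scores) / tt.sum(tt.exp(scores))  # softmax by hand
    pm.Multinomial("obs", n=counts.sum(), p=probs, observed=counts)

With the proposed change, users could declare all 3 scores directly and PyMC would pin the reference category internally.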
- Same principle, but applied to the API for MvNormal with Cholesky decomposition. This decomposition is virtually unavoidable for sampling efficiency, but I find the current API adds boilerplate code that harms models' readability:
import numpy as np
import pymc3 as pm
import theano.tensor as tt

with pm.Model():
    # prior on the standard deviations
    sd_dist = pm.Exponential.dist(1.0)
    # sample the packed (flattened) Cholesky factor of the covariance
    packed_chol = pm.LKJCholeskyCov("chol_cov", eta=4, n=2, sd_dist=sd_dist)
    # unpack it into a 2x2 lower-triangular matrix
    chol = pm.expand_packed_triangular(2, packed_chol, lower=True)
    # rebuild the covariance matrix to extract the sds and correlations
    cov = pm.math.dot(chol, chol.T)
    sigma_ab = pm.Deterministic("sigma_cluster", tt.sqrt(tt.diag(cov)))
    corr = tt.diag(sigma_ab ** -1).dot(cov.dot(tt.diag(sigma_ab ** -1)))
    r = pm.Deterministic("Rho", corr[np.triu_indices(2, k=1)])
Having an API that does the decomposition under the hood and returns the standard deviations and correlations directly in the trace (as Stan does) would be great, I think! What's your take on that?
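To illustrate, here is a purely hypothetical sketch of what such an API could look like; the compute_corr flag and the tuple of return values are my invention, not existing PyMC code:

import pymc3 as pm

with pm.Model():
    sd_dist = pm.Exponential.dist(1.0)
    # imagined signature: LKJCholeskyCov unpacks the Cholesky factor itself
    # and registers the correlations and standard deviations as Deterministics
    chol, corr, sigmas = pm.LKJCholeskyCov(
        "chol_cov", eta=4, n=2, sd_dist=sd_dist, compute_corr=True
    )

This would collapse the boilerplate above into a single call, and corr and sigmas would show up directly in the trace.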
I'd be happy to work on these PRs, with the guidance of a mentor if possible, given the complexity of the code base.
Happy to discuss all that with you! PyMCheers