I attempted and failed to get DensityDist to work properly. I think I am just failing to wrap my head around Aesara and the ins and outs of TensorVariables, RandomVariables, SharedVariables, Functions, etc.
The code for my attempt is here. Lines marked with “!!!” are the ones that are causing confusion. In pseudocode, what I’m trying to do is essentially
    def logp(mu, rho):
        q.mu.set_value(mu)
        q.rho.set_value(rho)
        kl = KL(q)
        lam = ...  # a hyperparameter set elsewhere
        log_det_fisher = -2 * at.sum(at.log(at.diag(at.slinalg.cholesky(q.cov))))
        # Equation (10) from our paper
        return 1/2 * log_det_fisher - lam * kl.apply(f=None)
    ...
    with pm.Model() as mixing_distribution:
        pm.DensityDist("theta",  # theta = variational params (could be called phi instead)
                       dist_params=tuple(p.get_value() for p in self.q.params),
                       logp=logp,
                       ...)
    mixture = pm.sample(..., model=mixing_distribution)
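For reference, this is how I read the log_det_fisher line, assuming q.cov is symmetric positive definite so that its Cholesky factor exists. I write Σ for q.cov and L for its lower-triangular Cholesky factor (both symbols are just my notation here):

$$
\Sigma = L L^{\top}
\;\Rightarrow\;
\det \Sigma = (\det L)^2 = \prod_i L_{ii}^2
\;\Rightarrow\;
\log\det\Sigma = 2 \sum_i \log L_{ii}
$$

$$
\texttt{log\_det\_fisher} = -2 \sum_i \log L_{ii} = -\log\det\Sigma = \log\det\Sigma^{-1}
$$

So, if I have this right, the expression is the log-determinant of the inverse covariance, computed without forming the inverse.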
Here’s what I think the problems with this are, but please let me know if I’m way off base!
- q.mu.set_value and q.rho.set_value aren’t working the way I expect
- DensityDist expects logp to return a tensor, but I am returning an aesara function
- I may be confused about the distinction between dist_params and theta
- I’m violating the pymc functional style by storing q, lam, and kl as instance variables of an object (this is not shown explicitly here but is how I approached it in the more complete attempt)
Thank you in advance for any further guidance!