I have multivariate normal data which is correlated according to a DAG structure. I've been struggling to debug a specific error: `ValueError: MulSD.grad returned a term with 2 dimensions, but 0 are required.`
Here’s my model:
```python
# Model for y ~ N(μ, Λ^-1) where Λ = (I - γW)^T Ω (I - γW),
# W is a sparse directed graph, and γ is an off-diagonal precision parameter.
# Number of rows is `n` and number of covariates is `p`.
with pm.Model() as model:
    ω = pm.HalfNormal('ω', shape=n)
    β = pm.Normal("β", shape=p)
    γ = pm.HalfNormal("γ")
    μ = X @ β
    δ = pm.math.constant(y) - μ
    I = sparse.square_diagonal(pt.ones(n))
    W = sparse.as_sparse_or_tensor_variable(W_scipy) * γ
    IminusW = sparse.sub(I, W)
    Ω = sparse.square_diagonal(ω)
    Λ = sparse.basic.mul(sparse.basic.mul(IminusW, Ω), IminusW)
    Λδ = sparse.dot(Λ, δ[:, None])
    quaddist = pt.dot(δ[None, :], Λδ).squeeze()  # works with quaddist = pt.dot(δ, δ)
    logdet = pt.sum(pt.log(ω))  # log|Λ| = Σ log ω, since |I - γW| = 1 for a DAG
    norm = -0.5 * pm.math.constant(n) * np.log(2 * np.pi)  # -n/2 · log(2π)
    logp = norm - 0.5 * quaddist + 0.5 * logdet
    pm.Potential('likelihood', logp)
    trace = pm.sample(chains=1, cores=1, tune=5, draws=5)
```
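For what it's worth, the `logdet` shortcut can be sanity-checked in plain NumPy: because W is a DAG adjacency matrix (nilpotent once topologically ordered), det(I - γW) = 1, so log|Λ| reduces to Σ log ω. A minimal dense sketch with made-up parameter values, just for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
# A strictly lower-triangular W plays the role of a topologically sorted DAG.
W = np.tril(rng.normal(size=(n, n)), k=-1)
gamma = 0.7
omega = rng.uniform(0.5, 2.0, size=n)

IminusW = np.eye(n) - gamma * W
Lam = IminusW.T @ np.diag(omega) @ IminusW  # precision matrix

sign, logdet = np.linalg.slogdet(Lam)
# |I - γW| = 1 because γW is nilpotent, so log|Λ| = Σ log ω
assert sign == 1.0
assert np.isclose(logdet, np.log(omega).sum())
```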
Any advice is greatly appreciated!
Here’s a notebook with a runnable example showing what I’m trying to do:
Looks like a bug in the gradient of `MulSD` (sparse * dense). `pt.grad(W.sum(), γ)` is sufficient to trigger the error. My guess is that this is related to broadcasting: PyTensor knows the output of this multiplication should have the shape of W, but then it sees that γ is a scalar.
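For intuition, the dense analogue shows why the gradient with respect to γ must be 0-dimensional even though the forward output has W's shape: differentiating `(γ * W).sum()` with respect to the scalar γ gives `W.sum()`, a scalar. A finite-difference sketch in NumPy (illustrative only, not PyTensor's code path):

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 4))

def f(gamma):
    # scalar * matrix broadcasts to the shape of W; the sum is a scalar
    return (gamma * W).sum()

# d/dγ Σ(γW) = ΣW — a 0-d quantity, matching γ's dimensionality
eps = 1e-6
fd_grad = (f(0.5 + eps) - f(0.5 - eps)) / (2 * eps)
assert np.isclose(fd_grad, W.sum())
assert np.ndim(fd_grad) == 0
```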
Note the `expand_dims` to make `x.ndim == 2`. I suspect we should also be doing this dimension broadcasting in the sparse case, at least in the gradients.
If you do the `expand_dims` yourself, it bypasses the error. But it would still be nice if you opened an issue.

Just kidding: it makes the symbolic gradients work, but leads to a new error when trying to evaluate the graph.
```python
W = sparse.as_sparse_or_tensor_variable(W_scipy) * pt.expand_dims(γ, [0, 1])
```
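The dense equivalent of that workaround, for reference: `expand_dims` turns the 0-d scalar into a (1, 1) array, which broadcasts exactly like the bare scalar but keeps both operands 2-D. (The sketch below uses NumPy's `expand_dims`, not PyTensor's, purely to show the shape behavior.)

```python
import numpy as np

gamma = 0.7
g2d = np.expand_dims(gamma, axis=(0, 1))  # shape (1, 1)
assert g2d.shape == (1, 1)

W = np.arange(6.0).reshape(2, 3)
# (1, 1) broadcasts against (2, 3) exactly like the bare scalar does,
# so the numerical result is unchanged.
assert np.array_equal(g2d * W, gamma * W)
```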