ZeroSumNormal - references?

DrEntropy · August 30, 2025, 3:12pm

Is there a good article to read to understand the math behind the implementation of PYMC’s ZeroSumNormal? I could not find a citation in the code or docs.

Thanks!

jessegrabowski · August 31, 2025, 1:32pm

The best citation is this random discussion thread on the numpyro github page of all places.

Bug @aseyboldt to write more papers.

DrEntropy · August 31, 2025, 3:30pm

Thanks! The discussion there was quite helpful! And I look forward to the paper… nudge nudge @aseyboldt

lucianopaz · September 1, 2025, 8:49am

There’s also the section on degenerate multivariate normals from wikipedia that’s quite helpful for understanding how to compute the logp

mitzimorris · September 2, 2025, 7:23pm

Stan implemented @aseyboltaseybolt’s implementation as constrained parameter type ‘zero_sum_vector’ and I wrote a case study which goes through the math; the Stan reference manual also goes over this: Constraint Transforms. here’s the case-study: The Sum-to-Zero Constraint in Stan

DrEntropy · September 2, 2025, 7:52pm

Ah, this is very useful! I am also a Stan user so this will help.

tcapretto · September 15, 2025, 12:06am

This may also be helpful:

Normal distributed random variables with constraint? - Cross Validated

See the response and the reference in there to Kwong and Iglewicz 1996

DrEntropy · September 15, 2025, 3:46am

Indeed it is helpful, thanks!

bob-carpenter · September 17, 2025, 7:49pm

This is not how PyMC and Stan are defining sum-to-zero variables. We’re instead defining a new density. The easiest way to think about it is in terms of a change-of-variables. To have an N-vector that sums to zero, you only have N - 1 degrees of freedom—the last value has to be equal to the negative sum of the previous values.

A naive way to enforce this constraint is just this way. Use N - 1 free unconstrained variables and define the N-th variable to be the negation of the sum of the first N-1 values. Because it’s a linear constraint, the Jacobian is constant, so can be ignored for the purposes of sampling. The unnormalized density is

p(x) = \left(\prod_{n=1}^{N-1} \textrm{normal}(x_n \mid 0, \sigma)\right) \cdot \textrm{normal}(-\textrm{sum}(x_1, ..., x_{N-1}) \mid 0, \sigma).

The sampler only works over variables x_1,\ldots, x_{N-1}—the variable x_N is defined as a transform and plays a role in the density This density is not something that has marginal scale \sigma—the actual scales are slightly smaller due to the constraint (by a factor of \sqrt{(N - 1) / N}).

Despite being what we recommended for years for Stan, this turns out to not be a great way to do this transform geometrically. @aseyboldt and @spinkney figured out how to do this much better using the first half of the isometric log ratio transform. The isometric part is that it keeps the variables relatively independent in the unconstrained space where we sample.

The full derivation is is available in the Stan Reference Manual section on sum-to-zero constraints. It has worked way better than the naive approach in all of the examples where we’ve tried it. Thanks, @aseyboldt and @spinkney !

spinkney · September 17, 2025, 8:24pm

Thanks @bob-carpenter!

Some clarification on:

Both PyMC and Stan use orthogonal transformations but PyMC uses Householder reflectors while Stan uses Helmert transforms (which underpin the isometric log ratio transform). Any orthogonal transformation matrix works here and we can convert Helmert to Householder and back with a suitable permutation matrix.

I think the Stan and PyMC versions will have similar speed and memory costs for the vector case. The matrix case was hand crafted in Stan to only do one double loop without forming any additional matrices. I think the householder one has a bit more computation but still doesn’t need to form additional matrices. The generalization to dimensions > 2 is already in PyMC while not in Stan (yet).

bob-carpenter · September 17, 2025, 9:00pm

Thanks!

Topic		Replies	Views
Trying to use ZeroSUMNormal and not understanding results v5 modeling	8	776	May 2, 2023
Trying to understand the ZeroSumNormal distribution v5 modeling	0	491	January 24, 2023
Ordered transform internals development , bug	18	215	August 20, 2024
Did anybody replicate the ADVI examples in the "Automatic Variational Inference in Stan" paper? Questions	4	1807	April 9, 2024
Constrain the sum of two categorical variables Questions modeling	3	816	January 24, 2022

ZeroSumNormal - references?

Related topics