Nuts-rs: meaning of the parameters in the CpuLogpFunc

Yeah I meant sufficient statistics. Thanks for bringing it up

Regarding static graphs, you can always use a black-box model in PyMC; you’ll just have to handle the gradients yourself or use gradient-free methods.

Also even static graphs can be pretty “fluid”, they can vary in dimension length and can use conditionals and loops with near arbitrary stop conditions.

You can also have partially black box models. We use this for Simulator models with SMC sampling that can have arbitrary dynamic code in it.

Of course, as you mentioned, if you go there you may need to do more grunt work to maintain computational performance.

Does this mean that graphs don’t need to be recompiled anymore when the data changes?

They don’t. If your data is defined via pm.Data, it’s treated as a placeholder with a fixed number of dimensions but mutable shape (and contents), and any compiled function will take it as an input instead of treating it as a constant.

Having said that, PyMC currently does not exploit this and always recompiles the function on subsequent calls to pm.sample (or whatever sampling routine). There are plans to avoid this. Nutpie already lets you exploit it by compiling the model once and then updating the data without recompiling.

Sorry for the late reply guys.

Indeed, I did some work like this for a particular project. In that project the covariates were all categorical, and for every combination of them there were multiple observations (in some cases, hundreds of thousands).

I rewrote the logp as follows:

\log p(x_1, \dots, x_n \mid \mu, \sigma) = - n \left(\log(\sqrt{2 \pi}) + \log(\sigma)\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}{x_i^2} + \frac{\mu}{\sigma^2} \sum_{i=1}^{n}x_i - \frac{n \mu^2}{2\sigma^2}

which indeed only depends on the sufficient statistics. If you have multiple groups, you need the sum, sum of squares, and sample size per group.
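To make this concrete, here is a minimal numpy/scipy sketch (function and variable names are my own, not from any library) that evaluates the Normal log-likelihood from just the cached sufficient statistics (n, sum, sum of squares) and checks it against the direct per-observation sum:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=10_000)

# Sufficient statistics, computed once per group.
n = x.size
s1 = x.sum()          # sum of observations
s2 = (x ** 2).sum()   # sum of squares

def logp_from_stats(mu, sigma, n, s1, s2):
    """Normal log-likelihood using only (n, s1, s2)."""
    return (
        -n * (np.log(np.sqrt(2 * np.pi)) + np.log(sigma))
        - s2 / (2 * sigma**2)
        + mu * s1 / sigma**2
        - n * mu**2 / (2 * sigma**2)
    )

mu, sigma = 1.8, 1.4
fast = logp_from_stats(mu, sigma, n, s1, s2)
slow = stats.norm.logpdf(x, loc=mu, scale=sigma).sum()
print(np.isclose(fast, slow))  # the two should agree
```

The point is that after the one-time pass over the data, each logp evaluation is O(number of groups) instead of O(number of observations).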

This can also be done for other distributions in the exponential family.

I also found this resource helpful: GitHub - mike-lawrence/stancon_2023_tutorial_hieararchical_modeling, with the video here: https://www.youtube.com/watch?v=Ns30ngCFArQ&t=2660s&ab_channel=MikeLawrence.

Finally, if you’re doing non-hierarchical Gaussian regression with Gaussian priors on the regression coefficients and you really care about speed, you may want to check whether you can write your model so that it has a closed-form posterior, removing the need for a sampler.
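For reference, with a Gaussian likelihood with known noise and a Gaussian prior on the coefficients, the posterior is itself Gaussian with a closed-form mean and covariance. A minimal numpy sketch (data and names are illustrative, not from the thread):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: y = X @ beta_true + noise
n, p = 200, 3
X = rng.normal(size=(n, p))
beta_true = np.array([0.5, -1.0, 2.0])
sigma = 0.3                      # known observation noise sd
y = X @ beta_true + rng.normal(scale=sigma, size=n)

# Prior beta ~ N(0, tau^2 I). The conjugate posterior is Gaussian:
#   cov  = (X'X / sigma^2 + I / tau^2)^{-1}
#   mean = cov @ X'y / sigma^2
tau = 10.0
precision = X.T @ X / sigma**2 + np.eye(p) / tau**2
post_cov = np.linalg.inv(precision)
post_mean = post_cov @ (X.T @ y) / sigma**2

print(post_mean)  # close to beta_true for a weak prior and enough data
```

No sampler is involved; the posterior is exact up to linear algebra.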

Yes, this is the whole point of the exponential family. But if you want to go further with optimisations, you can even use this trick for distributions that are not in the exponential family but whose logpdf can be expressed as a polynomial of fixed degree in the variable, plus some part that can’t. Something like this (where \theta is a vector of parameters):

\log P(x \mid \theta) = g(x, \theta) + \sum_{k=1}^{m} f_k(\theta) x^k

When you sum over the x_i, you can use the cached sums of the powers of x_i for the polynomial part and only have to sum g(x_i, \theta) over the data. This can reduce the size of your computational graph by a lot.
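As an illustration of the split between a cached polynomial part and a g that must still be summed, here is a numpy/scipy sketch using the Gumbel distribution (my choice of example, not from the thread): its logpdf is -\log\beta - (x-\mu)/\beta - e^{-(x-\mu)/\beta}, so the part linear in x uses the cached sum, while the exponential term is the g(x, \theta) summed each call:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.gumbel(loc=1.0, scale=2.0, size=5_000)

# Cached once: sample size and the sum of first powers of x.
n = x.size
s1 = x.sum()

def gumbel_logp_cached(mu, beta, x, n, s1):
    """Gumbel log-likelihood. The polynomial-in-x part,
    f_0(theta)*n + f_1(theta)*s1, uses cached sums; only the
    non-polynomial g(x, theta) = -exp(-(x - mu)/beta) is summed
    over the data on each call."""
    poly_part = n * (-np.log(beta) + mu / beta) - s1 / beta
    g_part = -np.exp(-(x - mu) / beta).sum()
    return poly_part + g_part

mu, beta = 0.8, 2.1
fast = gumbel_logp_cached(mu, beta, x, n, s1)
slow = stats.gumbel_r.logpdf(x, loc=mu, scale=beta).sum()
print(np.isclose(fast, slow))  # both evaluations should agree
```

Only the g term scales with the data here; with a cheap g (or none at all) the savings per logp evaluation can be substantial.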

You can use this for the Weibull distribution, which is useful if you want to do survival analysis, for example.
