Creating your own model in nutpie

Hello everyone! Can I generate a sample using the NUTS in nutpie if I do not know the analytical density and distribution functions?
I can numerically calculate the value of the density function and the value of its gradients at any point. I would like to implement the NUTS algorithm using nutipe to generate a sample from an this unknown distribution and estimate its parameters.

I read the discussion of this issue on the Stan form (Fitting external models with Stan - Modeling - The Stan Forums ), I also found an example code there. However, when I tried to test some simple model on it, I ran into a problem. For some reason, the function stops working adequately for variance values far from 1.
Here is an example of my code:

import nutpie
from nutpie.compiled_pyfunc import from_pyfunc

def log_multinormal_distribution(x : np.ndarray):
    mean = np.array([.0])
    cov = np.array([[3]])
    return -1/2* (x - mean).T @ np.linalg.inv(cov) @ (x - mean), -cov @ (x - mean)

def make_logp_func():
    def logp(x : np.ndarray
             , **unused):
        return log_multinormal_distribution(x)
    return logp

def make_expand_func(*unused):
    def expand(x, **unused):
        return {"y": x}
    return expand

# sample
model = from_pyfunc(1, make_logp_func, make_expand_func, [np.float64], [(1,)], ["y"],)
fit = nutpie.sample(model, draws=10000, tune=3000)
sample = fit.posterior.y

If you use the standard PyMC model, then everything will turn out great, here is an example:

import pymc as pm
import numpy as np
import nutpie

with pm.Model() as simple_model:
    norm = pm.Normal('norm', 0, 100)
compiled_model = nutpie.compile_pymc_model(simple_model)
trace_pymc = nutpie.sample(compiled_model)

Since I’m new to rust and couldn’t find the full documentation for nutpie, the error may be very stupid…

What error do you get? And cc @aseyboldt

When creating your own model and running it, the sampler takes much longer to calculate the result than with the PyMC model. In addition, the sampling result is much worse. The average value of the sample is far from the theoretical one. The algorithm often relies on maxdepth=1024.
This does not happen with the standard PyMC model.

That’s usually what happens if the gradient is incorrect. (With a correct gradient it will probably still be a bit slower as with pymc, we have some python ovearhead for python logp functions).

It should be

-linalg.inv(cov) @ (x - mean)

Thank you very much! I really did not differentiate the matrix expression correctly

In my experience hand coded gradients are always incorrect on the first attempt. :slight_smile:
Checking against numerical differentiation is really useful.

1 Like

matrixcalculus.org and wolframalpha.com are your friends if you need analytical gradients.

2 Likes