Sampling from a non-analytical distribution in C++

Hello everyone,

I’m trying to solve the problem of sampling from a non-analytical distribution in C++. I have a .cpp function that calculates the energy and its gradient at a point, and I want to generate samples using the NUTS algorithm. I have already done this in Rust and Python with nuts-rs and nutpie.

I know that I can create a bridge between Rust and C++, but my task is very specific, and I can’t use any non-C++ tooling, such as Cargo or even Python.

So, how can I solve this problem? Are there any C++ wrappers for nuts-rs or other useful programs?

Thanks for your help!

Is the requirement just that the top-level interface is in C++, or that the entire pipeline is C++? I didn’t understand how you could build a bridge between Rust and C++ if you can’t use any non-C++ features. If it’s only that you need a C++ interface, the simplest thing to do might be to clone and modify @aseyboldt’s interface design for PyMC and bridgestan.

Alternatively, Stan (GitHub repo stan-dev/stan) is 100% C++. You can use Stan directly from C++ through the services layer in stan-dev/stan, though this is not well documented, as most of our users are not developers. Stan’s CmdStan command-line interface (repo stan-dev/cmdstan) is also 100% C++, but you can’t plug generic C++ into CmdStan, only models coded in Stan. We haven’t implemented the really slick warmup from nutpie, though, so I’d recommend nutpie if it satisfies your requirements.

Essentially, the task is scientific in nature, so the most important thing is to arrive at a solution. Ideally, the user should interact as little as possible with tools unrelated to C++. In particular, my colleagues dislike that using @aseyboldt’s nuts-rs requires installing the Cargo toolchain and then integrating Rust code into the main codebase. The same applies to nutpie.

I recognize that this is likely the simplest solution to the problem, but I cannot implement such a specialized model in Stan, because my C++ function computes the energy and gradient numerically.

By “non-analytical” and “numerical”, do you mean it requires a non-standard embedded iterative solver of some kind (i.e., not just an ODE solver, root finder, 1D integrator, etc.)? Even if you need an iterative algorithm, if the energy is continuously differentiable and can be defined in terms of elementary distributions, transforms, and matrix operations, you should be able to code it in PyMC and let that take care of the derivatives.
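For instance, here’s a minimal sketch with a toy Gaussian energy standing in for yours; the point is that anything written with pytensor operations is differentiated automatically, so NUTS gets exact gradients without you coding any.

import numpy as np
import pymc as pm
import pytensor.tensor as pt

data = np.array([1.2, 0.7, 2.1])  # toy observations

with pm.Model() as model:
    mu = pm.Normal("mu", sigma=10)
    # A log-density term built from pytensor ops; PyMC differentiates it
    # automatically, so NUTS needs no hand-coded gradient.
    pm.Potential("loglike", -0.5 * pt.sum((data - mu) ** 2))
    idata = pm.sample()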

And what if I have code that I cannot translate into Python? The C++ function is given to me as a compiled file (in fact, it’s written in Fortran), so I compute the energy by calling the compiled binary.

You can call compiled Fortran/C++ code from within Python (this happens every time you use numpy/BLAS/LAPACK) using, e.g., ctypes.
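Something like the following (the library name and exported symbols here are hypothetical; adapt them to whatever your compiled binary actually exposes):

import ctypes
import numpy as np

# Load the compiled library (hypothetical name; use your actual binary).
lib = ctypes.CDLL("./energy.so")

# Suppose the binary exports, with C linkage:
#   double energy(const double* q, int n);
#   void   energy_grad(const double* q, int n, double* out);
dptr = ctypes.POINTER(ctypes.c_double)
lib.energy.restype = ctypes.c_double
lib.energy.argtypes = [dptr, ctypes.c_int]
lib.energy_grad.restype = None
lib.energy_grad.argtypes = [dptr, ctypes.c_int, dptr]

def energy(q):
    q = np.ascontiguousarray(q, dtype=np.float64)
    return lib.energy(q.ctypes.data_as(dptr), q.size)

def energy_grad(q):
    q = np.ascontiguousarray(q, dtype=np.float64)
    out = np.empty_like(q)
    lib.energy_grad(q.ctypes.data_as(dptr), q.size, out.ctypes.data_as(dptr))
    return out

Once energy and energy_grad are plain Python callables, any of the Python samplers can use them.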

If your dependency requirements are extremely strict, it sounds like it would be easier to find a project that implements a sampler natively in C++?

The dependency requirements are indeed quite strict. I couldn’t find anything similar to nuts-rs in C++, which is why I asked here. If you know of any good C++ samplers, I’d appreciate a recommendation.

Right now, it seems to me that using nutpie is the best option.

What are the dependency requirements? Does it have to be 100% pure C++ of some version? Can you use libraries like Eigen or Boost, or is it strictly dependency-free C++?

If this has to be a 100% pure C++ project and you only need to be able to fit this one model and can put some time/compute into tuning the tuning parameters, then I’d suggest directly coding the multinomial version of HMC.* It’s a rejection-free algorithm that also solves the nasty harmonics problem of vanilla HMC. It’s not as efficient as NUTS, but it’s not that far off once you have a reasonable step size and max number of steps.

If it doesn’t have to be pure C++, then you can do what @jessegrabowski suggests and embed the C++ definition in a foreign function interface through ctypes. It’s maybe a bit easier than writing multinomial HMC, but not much.

I don’t know of any existing C++ implementations of something like NUTS that are detached from probabilistic programming frameworks and don’t drag in dependencies of their own.


* Randomly split the total number of steps L into F + B, where F are forward steps and B are backward steps, then select a point from the resulting trajectory with probability proportional to its exponentiated negative energy.
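In case it’s useful, here’s a rough NumPy sketch of that recipe (my paraphrase, not code taken from Stan; U and grad_U stand in for your compiled energy and gradient functions, and translating this to dependency-free C++ is mechanical):

import numpy as np

def leapfrog(q, p, grad_U, eps):
    # One leapfrog step for the Hamiltonian H(q, p) = U(q) + p.p / 2.
    p = p - 0.5 * eps * grad_U(q)
    q = q + eps * p
    p = p - 0.5 * eps * grad_U(q)
    return q, p

def multinomial_hmc_step(q0, U, grad_U, eps, L, rng):
    # Refresh the momentum, then build a trajectory of L leapfrog steps,
    # randomly split into F forward and B = L - F backward steps.
    p0 = rng.standard_normal(q0.shape)
    F = int(rng.integers(0, L + 1))

    states = [q0]
    energies = [U(q0) + 0.5 * p0 @ p0]

    q, p = q0, p0
    for _ in range(F):  # integrate forward in time
        q, p = leapfrog(q, p, grad_U, eps)
        states.append(q)
        energies.append(U(q) + 0.5 * p @ p)

    q, p = q0, -p0  # flip the momentum and integrate backward in time
    for _ in range(L - F):
        q, p = leapfrog(q, p, grad_U, eps)
        states.append(q)
        energies.append(U(q) + 0.5 * p @ p)

    # Multinomial selection: pick a state with probability proportional to
    # exp(-H), stabilized by subtracting the minimum energy.
    e = np.asarray(energies)
    w = np.exp(-(e - e.min()))
    return states[rng.choice(len(states), p=w / w.sum())]

Here q0 is the current draw and rng is a np.random.default_rng(); you still have to pick eps and L yourself, which is the tuning time/compute investment mentioned above.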

Right now, it seems to me that using nutpie with ctypes is the best option.

Thank you all for your help and the time you’ve spent on my problem!

Before spending a lot of time trying to sample in C++, I’d make very sure that just using pymc, stan, or numpyro etc. really isn’t an option. The other options come with quite a few challenges: not just the sampling itself and all the build and dependency issues that might come up, but you also need to figure out how to access sampler statistics, compute effective sample size, and so on. I’m sure that is all doable, but it’s a lot of work. It of course depends a lot on the specifics, but this could quickly become much harder to write and maintain than rewriting a density.

But assuming you are sure that that’s what you need:

  • Again, not exactly what you seem to want, but I’ve heard pretty good things about corrosion-rs and/or cxx for including Rust dependencies in C++ CMake projects (I’ve never used them myself though). You could use that to call nuts-rs directly from the C++ code, and at least avoid Python. That doesn’t solve any issues with convergence checking etc., but at least you could sample. Users would still need to have rustup installed though…
  • As Bob already mentioned, isn’t the Stan library just what you are looking for? A C++ library that implements the sampler, given a function that computes the log density and its gradient. It does come with a couple of dependencies itself though (for instance Boost), so including it in a larger code base might not be trivial.

I didn’t mean to suggest that! Stan doesn’t make it so easy to plug in a new density. It has to be set up to extend our abstract base model class, which was designed explicitly with Stan in mind. And it’s a monstrously large and tangled project with a lot else going on. Even just the sampler is spread over two repositories, stan-dev/stan and stan-dev/math.

Having said that, I just started a project to write a standalone version of NUTS in C++ to try out releasing code the same way @aseyboldt released nutpie; the goal is to follow it with the auto step-size tuning version. I should be able to write a more performant version than the one we have in Stan and release it in such a way that it’ll work with both PyMC and Stan models.


I wonder what a non-analytical distribution in C++ looks like. Just for fun, I did something similar. What you see below was done on Windows 11 with the help of Gemini 2.5 Flash. By the way, I don’t know anything about C++.

The first step is to show a known distribution; I chose the negative binomial distribution with the following parametrization:

\text{NegBinomial}(n \, | \, \mu, \alpha) = \binom{\, n + \alpha - 1}{n} \, \left( \dfrac{\mu}{\mu+\alpha} \right)^{\!n} \, \left( \dfrac{\alpha}{\mu+\alpha} \right)^{\!\alpha} \!.

As you may know, you have to find the logarithm of that expression and then the derivatives of the logarithm of the distribution:

\log \text{NegBin} (n \, | \, \mu, \alpha) = \text{some expression} \\ \dfrac{\partial \log \text{NegBin} }{\partial \mu} = \text{some expression} \\ \dfrac{\partial \log \text{NegBin} }{\partial \alpha} = \text{some expression}
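Spelled out (these are exactly the formulas the C++ code below implements, with ψ denoting the digamma function):

\log \text{NegBin} (n \, | \, \mu, \alpha) = \log \Gamma(n + \alpha) - \log \Gamma(n + 1) - \log \Gamma(\alpha) + \alpha \log \alpha + n \log \mu - (n + \alpha) \log(\mu + \alpha) \\ \dfrac{\partial \log \text{NegBin} }{\partial \mu} = \dfrac{n}{\mu} - \dfrac{n + \alpha}{\mu + \alpha} \\ \dfrac{\partial \log \text{NegBin} }{\partial \alpha} = \psi(n + \alpha) - \psi(\alpha) + \log \alpha - \log(\mu + \alpha) + 1 - \dfrac{n + \alpha}{\mu + \alpha}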

Well, let’s write the result in C++; call the file neg_binomial.cpp:

#include <cmath>    // for log, lgamma
#include <vector>   // to return the gradient as a vector
#include <limits>   // to handle very small/large values
#include <iostream> // for possible error/debug messages

// Include the digamma function. This is the part that requires an external library.
// Here we assume Boost.Math. If you don't have Boost, this part will fail,
// and you would need to find another digamma implementation or use an
// approximation (which is not ideal for accurate gradients).
#ifdef USE_BOOST
#include <boost/math/special_functions/digamma.hpp>
#endif


// Computes the log PMF of the negative binomial distribution.
// Parametrization: k (observation), mu (mean), alpha (dispersion)
// k: non-negative integer (number of observed failures)
// mu: positive real (expected mean)
// alpha: positive real (dispersion parameter)
double logp_neg_binomial(int k, double mu, double alpha) {
    // Validate basic inputs (PyMC usually handles this with constraints on
    // the parameters, but basic validation is good to have).
    if (mu <= 0.0 || alpha <= 0.0 || k < 0) {
        // Return -infinity if the inputs are invalid
        return -std::numeric_limits<double>::infinity();
    }

    // Compute log(P(k; mu, alpha)) using the formula:
    // lgamma(k + alpha) - lgamma(k + 1) - lgamma(alpha)
    //   + alpha*log(alpha) + k*log(mu) - (k + alpha)*log(mu + alpha)
    // Given the checks above, k + alpha > 0 and alpha > 0 always hold,
    // so lgamma is well defined here.
    double log_gamma_term = std::lgamma(k + alpha) - std::lgamma(k + 1.0) - std::lgamma(alpha);

    double log_prob = log_gamma_term
                      + alpha * std::log(alpha)
                      + k * std::log(mu)
                      - (k + alpha) * std::log(mu + alpha);

    return log_prob;
}

// Computes the gradient of the log PMF with respect to mu and alpha.
// Returns a vector<double> whose first element is the gradient w.r.t. mu
// and whose second element is the gradient w.r.t. alpha.
// k: non-negative integer
// mu: positive real
// alpha: positive real
std::vector<double> grad_neg_binomial_logp(int k, double mu, double alpha) {
    std::vector<double> gradients(2); // gradients[0] for mu, gradients[1] for alpha

    // Validate inputs
    if (mu <= 0.0 || alpha <= 0.0 || k < 0) {
        gradients[0] = std::numeric_limits<double>::quiet_NaN(); // Not a Number
        gradients[1] = std::numeric_limits<double>::quiet_NaN();
        return gradients;
    }

    // Gradient with respect to mu: k/mu - (k + alpha)/(mu + alpha)
    gradients[0] = k / mu - (k + alpha) / (mu + alpha);

    // Gradient with respect to alpha:
    // psi(k + alpha) - psi(alpha) + log(alpha) - log(mu + alpha) + 1 - (k + alpha)/(mu + alpha)
    // This requires the digamma function (psi).
#ifdef USE_BOOST
    double digamma_k_alpha = boost::math::digamma(k + alpha);
    double digamma_alpha = boost::math::digamma(alpha);

    gradients[1] = digamma_k_alpha - digamma_alpha
                + std::log(alpha) - std::log(mu + alpha)
                + 1.0 - (k + alpha) / (mu + alpha);
#else
    // Without Boost's digamma we cannot compute this component;
    // flag it as NaN rather than silently returning 0.
    gradients[1] = std::numeric_limits<double>::quiet_NaN();
#endif

    return gradients;
}

// NOTE: main() below is only there to test the functions; it is not
// needed for the Python integration.
int main() {
    int k = 10;
    double mu = 3;
    double alpha = 2; // equivalent to r = 2, p = 2/(3+2) = 2/5 = 0.4

    double log_prob = logp_neg_binomial(k, mu, alpha);
    std::cout << "Log-prob NB(k=" << k << ", mu=" << mu << ", alpha=" << alpha << ") = " << log_prob << std::endl;

    std::vector<double> grad = grad_neg_binomial_logp(k, mu, alpha);
    std::cout << "Gradient w.r.t. mu = " << grad[0] << std::endl;
    std::cout << "Gradient w.r.t. alpha = " << grad[1] << std::endl;

    return 0;
}

Gemini gave me all of that. The function main is just there to check the code; you can comment it out.

Since I have to compile this code to check it, I need g++. I installed Rtools45 and added these folders to PATH on Windows:

C:\rtools45\x86_64-w64-mingw32.static.posix\bin
C:\rtools45\usr\bin

I also need to download the Boost library. The command to compile is:

g++ -v neg_binomial.cpp -o neg_binomial.exe -fPIC -std=c++17 -D USE_BOOST -I "C:\Users\myuser\Downloads\boost_1_82_0"

To use the functions from Python, Gemini recommended pybind11; call the file neg_binomial_bindings.cpp:

#include <pybind11/pybind11.h>
#include <pybind11/stl.h> // needed to convert std::vector to Python lists

// Include the C++ file containing the function implementations.
// In a larger project, the ideal would be to include a .h/.hpp with the
// declarations; for this simple example we include the .cpp directly.
#include "neg_binomial.cpp"

namespace py = pybind11;

// PYBIND11_MODULE is a macro that creates the Python module's entry point.
// The first argument (neg_binomial_ext) is the name the module will have when
// you import it in Python (import neg_binomial_ext).
PYBIND11_MODULE(neg_binomial_ext, m) {
    // Optional: add a docstring to the module
    m.doc() = "Python bindings for the negative binomial log-probability and gradient functions in C++";

    // Expose the logp_neg_binomial function
    // m.def("name_in_python", &name_in_cpp, "docstring")
    m.def("logp", &logp_neg_binomial,
          "Computes the log PMF of the negative binomial (mu, alpha parametrization).");

    // Expose the grad_neg_binomial_logp function
    // pybind11/stl.h lets std::vector<double> convert automatically to a Python list.
    m.def("grad", &grad_neg_binomial_logp,
          "Computes the gradient of the negative binomial log PMF with respect to mu and alpha.");

    // Optional: you could expose other functions here (e.g., cdf, invcdf, etc.)
}

To compile, the command is:

g++ -v -shared neg_binomial_bindings.cpp -o neg_binomial_ext.pyd `
-I "C:\Users\myuser\anaconda3\Include" `
-I "C:\Users\myuser\anaconda3\Lib\site-packages\pybind11\include" `
-DUSE_BOOST -I "C:\Users\myuser\Downloads\boost_1_82_0" `
-fPIC -std=c++17 `
-L "C:\Users\myuser\anaconda3\libs" -l python312

The flags may change if you are on Linux or macOS. The result is a file called neg_binomial_ext.pyd; this is the file to be used with pymc.

If you want to check that you can use the functions, use this:


# test_neg_binomial_ext.py

import sys
import os

# Add the current directory to Python's path so it can find the compiled
# module if this script is not run from the same directory as the .pyd/.so.
# If you run the script in the same directory, this is not strictly necessary,
# but it's good practice if the module lives in a subdirectory, for example.
# current_dir = os.path.dirname(os.path.abspath(__file__))
# if current_dir not in sys.path:
#     sys.path.append(current_dir)

# Import the compiled module.
# The name 'neg_binomial_ext' must match the name given in PYBIND11_MODULE.
try:
    import neg_binomial_ext as nb_ext
    print("Module 'neg_binomial_ext' imported successfully.")
    print(f"Module location: {nb_ext.__file__}")  # shows the path of the loaded .pyd/.so

except ImportError as e:
    print(f"Error importing the module: {e}")
    print("Possible causes:")
    print("1. The file 'neg_binomial_ext.cp312-win_amd64.pyd' (or the right extension for your system/Python version)")
    print("   is not in the same directory as this script.")
    print("2. The .pyd/.so file name does not match the module name specified in PYBIND11_MODULE.")
    print("3. A compilation/linking error prevented the module from being loaded.")
    sys.exit(1)  # exit if the import fails

# --- Now you can use the functions exposed from C++ ---

# Define some input values for the negative binomial distribution
# (mu, alpha parametrization)
# k: number of observed failures (non-negative integer)
# mu: expected mean (positive real)
# alpha: dispersion parameter (positive real)

k_observado = 10
mu_param = 3
alpha_param = 2

print(f"\nTesting with k={k_observado}, mu={mu_param}, alpha={alpha_param}")

# Call the logp function exposed from C++ (under the name "logp")
try:
    log_probabilidad = nb_ext.logp(k_observado, mu_param, alpha_param)
    print(f"Log-probability computed in C++ (function 'logp'): {log_probabilidad}")
except Exception as e:
    print(f"An error occurred while calling 'logp': {e}")


# Call the gradient function exposed from C++ (under the name "grad")
try:
    vector_gradiente = nb_ext.grad(k_observado, mu_param, alpha_param)
    print(f"Gradient vector computed in C++ (function 'grad'): {vector_gradiente}")

    # grad returns a std::vector<double>, which pybind11 converts automatically to a Python list
    if isinstance(vector_gradiente, list) and len(vector_gradiente) == 2:
        print(f"  Gradient with respect to mu: {vector_gradiente[0]}")
        print(f"  Gradient with respect to alpha: {vector_gradiente[1]}")
    else:
         print("  Unexpected type or size of the value returned by 'grad'.")

except Exception as e:
    print(f"An error occurred while calling 'grad': {e}")

This part is the hardest: how to write the code with pytensor using the function we just created? I don’t pretend to understand the code; I have been reading the documentation for pytensor and this example to improve it.

import pymc as pm
import pytensor
import pytensor.tensor as pt
from pytensor.graph import Apply, Op
import numpy as np
import sys
import arviz as az
import matplotlib.pyplot as plt

try:
    import neg_binomial_ext as nb_ext
    print("C++ module neg_binomial_ext imported correctly for the PyTensor Op.")
except ImportError:
    print("Error: could not import the 'neg_binomial_ext' module. Make sure the .pyd/.so file is on the PYTHONPATH.")
    sys.exit(1)

class NegBinomialLogPGradientOp(Op):
    __hash__ = Op.__hash__
    def __eq__(self, other):
        return type(self) == type(other)

    def make_node(self, k_pt, mu_pt, alpha_pt):
        k_pt = pt.as_tensor_variable(k_pt, dtype='int32')
        mu_pt = pt.as_tensor_variable(mu_pt, dtype='float64')
        alpha_pt = pt.as_tensor_variable(alpha_pt, dtype='float64')

        # Type checks temporarily removed

        outputs = [pt.vector(dtype='float64')]

        return Apply(self, [k_pt, mu_pt, alpha_pt], outputs)

    def perform(self, node, inputs, outputs):
        k, mu, alpha = inputs
        gradients_list = nb_ext.grad(int(k), float(mu), float(alpha))

        if not isinstance(gradients_list, list) or len(gradients_list) != 2:
             raise ValueError(f"nb_ext.grad returned an unexpected format: {gradients_list}")

        outputs[0][0] = np.array(gradients_list, dtype='float64')

    def grad(self, inputs, output_gradients):
         # FIX: return floating-point zeros for all inputs (no Hessian is computed)
         # inputs are [k_pt, mu_pt, alpha_pt]
         return [pt.zeros_like(inputs[0], dtype='float64'),  # gradient w.r.t. k_pt
                 pt.zeros_like(inputs[1], dtype='float64'),  # gradient w.r.t. mu_pt
                 pt.zeros_like(inputs[2], dtype='float64')]  # gradient w.r.t. alpha_pt


class NegBinomialLogPOp(Op):
    __hash__ = Op.__hash__
    def __eq__(self, other):
        return type(self) == type(other)

    def make_node(self, k_pt, mu_pt, alpha_pt):
        k_pt = pt.as_tensor_variable(k_pt, dtype='int32')
        mu_pt = pt.as_tensor_variable(mu_pt, dtype='float64')
        alpha_pt = pt.as_tensor_variable(alpha_pt, dtype='float64')

        # Type checks temporarily removed

        outputs = [pt.scalar(dtype='float64')]  # this Op IS scalar

        return Apply(self, [k_pt, mu_pt, alpha_pt], outputs)

    def perform(self, node, inputs, outputs):
        k, mu, alpha = inputs
        logp_value = nb_ext.logp(int(k), float(mu), float(alpha))
        outputs[0][0] = np.array(logp_value, dtype='float64')

    def grad(self, inputs, output_gradients):
        k_pt, mu_pt, alpha_pt = inputs
        cost_grad = output_gradients[0]

        nb_grad_op = NegBinomialLogPGradientOp()
        gradients_pt_vector = nb_grad_op(k_pt, mu_pt, alpha_pt)

        d_logp_dmu_pt = gradients_pt_vector[0]
        d_logp_dalpha_pt = gradients_pt_vector[1]

        grad_mu = cost_grad * d_logp_dmu_pt
        grad_alpha = cost_grad * d_logp_dalpha_pt

        # FIX: return a symbolic floating-point zero for the integer input k_pt
        grad_k = pt.zeros_like(k_pt, dtype='float64')

        # The returned list must correspond to the inputs: [grad_wrt_k, grad_wrt_mu, grad_wrt_alpha]
        return [grad_k, grad_mu, grad_alpha]


# --- Use the custom Op in a PyMC model ---

# Create an instance of the log-probability Op (the scalar template)
neg_binomial_logp_op = NegBinomialLogPOp()

# Observed data (as a NumPy array)
observed_data_k = np.array([6, 9, 1, 3, 13, 8, 20, 11, 8, 16, 2, 11, 7, 6, 10, 14, 7, 45, 2,
                            3, 3, 17, 6, 11, 3, 16, 22, 4, 15, 8, 3, 3, 4, 0, 7, 1, 0, 5, 3,
                            5, 1, 1, 0, 1, 0, 8, 2, 15, 13, 4], dtype='int32')

if __name__ == '__main__':

    with pm.Model() as custom_nb_model:
        mu = pm.HalfNormal("mu", sigma=10)
        alpha = pm.HalfNormal("alpha", sigma=10)

        # List to hold the symbolic log-probability nodes for each data point
        logp_terms_list = []

        # FIX: build the graph manually by iterating over the data
        for i, k_i in enumerate(observed_data_k):
            # Convert each scalar data point into a symbolic PyTensor constant
            k_i_pt = pt.constant(k_i, dtype='int32')

            # Apply the scalar Op to this data point and the scalar parameters (priors)
            logp_i = neg_binomial_logp_op(k_i_pt, mu, alpha)

            # Append the symbolic result to the list
            logp_terms_list.append(logp_i)

        # Stack all the symbolic results into a single symbolic vector
        logp_term_vector = pt.stack(logp_terms_list)

        # Sum the log terms to get the total log-likelihood
        total_logp = pt.sum(logp_term_vector)

        # Add total_logp to the model's joint log-probability using pm.Potential
        logp_likelihood = pm.Potential("logp_likelihood_from_cpp", total_logp)

        print("Graph construction complete. Starting NUTS sampling...")
        # By default, pm.sample() tries to use the NUTS sampler when gradients are available.
        idata = pm.sample(2500)

    print("\nAnalyzing the results:")
    print(az.summary(idata))

    # Show the trace and posterior distribution plots
    az.plot_trace(idata)
    az.plot_posterior(idata)

    # Show the matplotlib figures
    plt.show()

Once again, I don’t know if this is something similar to what you are looking for.