Inference with Aesara/PyMC

I am working with Aesara and I would like to know whether it is possible to do inference with this library.

Let me explain:

I would like to know whether, given a sample of data, it is possible to recover the parameters of a distribution that best fits that data.
For example: given a data set that we want to fit with the PERT distribution (or any other distribution), can Aesara return the min, mode, max and lambda parameters that best describe the input data?
Typically the code would look something like below.

#Our sample data (which may or may not follow a distribution)
sample = tfd.PERT(10., 30., 40.).sample(10_000)

#sample = An arbitrary data set

#With PyMC/Aesara
inference(sample, Pert)  # would return approximately min=10, mode=30, max=40, lambda=4

I’m not comfortable with this library yet (I’m just starting with it) and I don’t have much idea how to do this. I know that this kind of thing was feasible with Theano and PyMC3, so I assume it is also feasible with Aesara, probably with better performance.

Below is my code:

import numpy as np
import pymc as pm
from scipy.stats import beta as beta_dist
import tensorflow_probability as tfp
 
tfd = tfp.distributions
 
data = tfd.PERT(1., 50., 65.).sample(10_000)
 
def pert_logp_fn(observed, low, peak, high, lmb=4):
    # You need to write the expression of the Pert log-density at observed, given the 3 parameters
    # PyMC does not have this distribution written down
    s_alpha = 1 + lmb*(peak - low)/(high-low)
    s_beta = 1 + lmb*(high - peak)/(high-low)
    x = ((observed - low) / (high-low))
    
    return np.log(beta_dist.pdf(x , s_alpha, s_beta) / (high-low))
 
with pm.Model() as m:
    # Define Flat or more informative priors for the Pert parameters
    low = pm.Flat("low")
    peak = pm.Flat("peak")
    high = pm.Flat("high")
 
    llike = pm.DensityDist("llike", low, peak, high, logp=pert_logp_fn, observed=data)
 
    # Finds the most likely posterior parameters
    map_ = pm.find_MAP()  # avoid shadowing the builtin map
 
    # Takes many samples from the posterior parameters
    posterior = pm.sample().posterior

I hope I’ve made this clear; don’t hesitate to get back to me if it isn’t.

Everything you could do with PyMC v3 + Theano you can do with PyMC v4 + Aesara. So what you described is certainly possible, and your code is quite close. DensityDist has changed in v4, however (I believe).
Maybe @ricardoV94 can chime in, but here is a notebook that might help some. It may be more elaborate than you need, but it might help to get you started.

You cannot use NumPy and SciPy functions with PyMC; instead, you can use the functions under pymc.math. Your logp should be defined something like:

def pert_logp_fn(observed, low, peak, high, lmb=4):
    # You need to write the expression of the Pert log-density at observed, given the 3 parameters
    # PyMC does not have this distribution written down
    s_alpha = 1 + lmb*(peak - low)/(high-low)
    s_beta = 1 + lmb*(high - peak)/(high-low)
    x = ((observed - low) / (high-low))

    return pm.logp(pm.Beta.dist(s_alpha, s_beta), x) - pm.math.log(high-low)
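
To sanity-check the translation, you can evaluate the symbolic expression at a fixed point and compare it with the SciPy version. A quick sketch (the parameter values and evaluation point are arbitrary):

import numpy as np
import pymc as pm
from scipy.stats import beta as beta_dist

# Arbitrary PERT parameters and an evaluation point
low, peak, high, lmb = 10.0, 30.0, 40.0, 4
s_alpha = 1 + lmb * (peak - low) / (high - low)
s_beta = 1 + lmb * (high - peak) / (high - low)
x = (25.0 - low) / (high - low)

# Symbolic PyMC logp, evaluated eagerly with .eval()
symbolic = pm.logp(pm.Beta.dist(s_alpha, s_beta), x) - pm.math.log(high - low)
print(symbolic.eval())
# Reference value computed with SciPy
print(np.log(beta_dist.pdf(x, s_alpha, s_beta) / (high - low)))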

Note that looking at this logp you should be careful to define priors that make sense, such as high > peak > low > 0, otherwise the sampler and find_MAP will struggle to get good results. You will have to tweak these yourself.


Thank you for your answer.

How can I check the low, peak, high parameters with PyMC? I don’t know the library well enough.

I imagine that a plain Python “if” is not enough.

Thanks in advance

You can use pymc.math.switch, which works like numpy.where.
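
A minimal sketch of how it behaves, using aesara.tensor directly (the same op is what pymc.math.switch aliases):

import numpy as np
import aesara.tensor as at

x = at.vector("x")
# Elementwise select between two branches, like np.where(cond, a, b)
y = at.switch(at.gt(x, 0), x ** 2, -x)
print(y.eval({x: np.array([-1.0, 2.0])}))  # [1. 4.]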

You can use this helper we use internally in PyMC to add constraints to the parameters (ultimately yielding -inf logp if unmet): pymc/dist_math.py at main · pymc-devs/pymc · GitHub

Example use: pymc/continuous.py at main · pymc-devs/pymc · GitHub

The at.foo functions are the same ones you get via pymc.math.foo; the latter is just a (restricted) alias for the former.
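
For a flavor of how the helper behaves, here is a minimal sketch (not from the thread): check_parameters returns the logp expression when all conditions hold and errors out otherwise, which the note above describes as ultimately yielding -inf logp:

import aesara.tensor as at
from pymc.distributions.dist_math import check_parameters

low, peak = at.scalars("low", "peak")
logp = at.constant(0.0)
# Evaluates to logp when the condition holds, raises a parameter error otherwise
safe_logp = check_parameters(logp, low < peak, msg="low < peak")
print(safe_logp.eval({low: 1.0, peak: 2.0}))  # 0.0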

Does this look right? I think I also have a problem converting the parameters to tensors.

import numpy as np
import pymc as pm
from scipy.stats import beta as beta_dist
import tensorflow_probability as tfp
from pymc.distributions.dist_math import check_parameters
import pytensor
import pytensor.tensor as at

tfd = tfp.distributions
data = tfd.PERT(1., 50., 65.).sample(10_000)

def pert_logp_fn(observed, low, peak, high, lmb=4):
    # You need to write the expression of the Pert log-density at observed, given the 3 parameters
    # PyMC does not have this distribution written down
    s_alpha = 1 + lmb*(peak - low)/(high-low)
    s_beta = 1 + lmb*(high - peak)/(high-low)
    x = ((observed - low) / (high-low))
    res = at.switch(
        at.lt(at.as_tensor_variable(low), peak) and at.lt(peak, high),
        pm.logp(pm.Beta.dist(s_alpha, s_beta), x) - pm.math.log(high-low),
        -np.inf,
    )
    return check_parameters(
        res,
        low < peak,
        peak < high,
        msg="lower <= c <= upper",
    )

with pm.Model() as m:
    # Define Flat or more informative priors for the Pert parameters
    low = pm.Flat("low")
    peak = pm.Flat("peak")
    high = pm.Flat("high")
    llike = pm.DensityDist("llike", low, peak, high, logp=pert_logp_fn, observed=data)
    # Finds the most likely posterior parameters
    map_ = pm.find_MAP()
    # Takes many samples from the posterior parameters
    posterior = pm.sample().posterior

Sorry for the indentation; I can’t preserve it when pasting.

Yes, that seems about right. You will need better priors than pm.Flat to respect your parameter constraints.
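
One way to encode the ordering by construction, as a sketch (not from the thread; the scale of 20 is an arbitrary assumption about the data range):

import pymc as pm

with pm.Model():
    low = pm.Normal("low", 0, 20)
    delta1 = pm.HalfNormal("delta1", 20)  # peak - low > 0 by construction
    delta2 = pm.HalfNormal("delta2", 20)  # high - peak > 0 by construction
    peak = pm.Deterministic("peak", low + delta1)
    high = pm.Deterministic("high", peak + delta2)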

Ok thank you.

I get this error, do you know how to resolve it? I tried testing with at.scalar, without success.

You have to use at.and_; you cannot use Python’s and (just like with NumPy). You might also have been mixing PyTensor with Aesara: unless you are using the development version of PyMC, you should still import Aesara!
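
A minimal sketch of the difference (using Aesara imports, per the note above):

import aesara.tensor as at

a, b = at.scalars("a", "b")
# Symbolic logical AND; Python’s `and` would try to coerce the tensors to bool and fail
cond = at.and_(at.lt(a, b), at.lt(b, 3.0))
print(cond.eval({a: 1.0, b: 2.0}))  # True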

Here is a Google Colab with the implementation

You don’t need the switch, because the Beta logp and the check_parameters already handle that case.

I tweaked the priors so they make more sense, but they are still not perfect (it’s still possible a priori for the peak to be higher than high). As you can see, NUTS has some divergences, but it does seem to be getting somewhere reasonable.

find_MAP seems to be stuck at the initial point (the estimates coincide with the mean of the prior), perhaps because it’s starting at a point where the gradients are too flat.

Note that I did not confirm if the original logp definition was correct at all. I assume it was.

I think there is an error in the logp. Indeed, above, my return np.log(beta_dist.pdf(x , s_alpha, s_beta) / (high-low)) was replaced by logp = pm.logp(pm.Beta.dist(s_alpha, s_beta), x) - pm.math.log(high-low).

The division by the range became a subtraction of the log of the range in the second expression.

Anyway, thank you very much for your help and patience. I will study this notebook in detail to try to understand how it works.

I was exploiting the property that log(a / b) == log(a) - log(b), so that should be fine (and more numerically stable).

Ok, thank you