LOO/WAIC metrics from pm.Potential based model?

simon_o · October 5, 2018, 7:45pm

I’m running a model inspired from the arma_example.py (https://github.com/pymc-devs/pymc3/blob/master/pymc3/examples/arma_example.py) code, using pm.Potential.

I’d very much like to get the LOO/WAIC metric for my model, to compare it with some other model, however when I try to do that it says “The model does not contain observed values”. Is there a workaround for this?

junpenglao · October 6, 2018, 12:09pm

Depending on how you implement the potential, if it is similar to DensityDist, you should rewrite it into a custom distribution. If it does not evaluated on the observed easily, it would be difficult. Recall that LOO/WAIC are not straightforward with multivariate and multi observed.

simon_o · October 6, 2018, 1:25pm

Ok, so by using DensityDist instead of Potential I would be able to use observed. This implies defining a custom likelihood function of the form:

def likelihood(value)

where value is a scalar that corresponds to every single value, passed one at a time to the function, from my observed data. Correct?

Now, because I’m interested in auto-regressive processes, I’d like to calculate the likelihood based on an arbitrary number of lags at the same time (I suppose – unless there is some math I’m missing that tells me I can do this without requiring all the lags of my AR(MA) process at once). Is that possible with a DensityDist and a custom likelihood function? In general is it possible to define custom timeseries-like probability distributions with DensityDist that rely on past lag values?

junpenglao · October 6, 2018, 3:31pm

Yes, ideally your function returns a logp for each value (or each slice of value if it is a multi-dimensional RV), pymc3 will compute the sum of logp internally.

If you have a set of number of lags (e.g., you want to compute lag=0, lag=1, … lag=5), the easiest I can think of is to have 6 different logp and then sum them together. If the number of lag is a random variable or can change dynamically during inference, then things becomes much more difficult.

simon_o · October 7, 2018, 2:15pm

Ok, I don’t know if I correctly understood what you are proposing, but I gave the following a try, and it seems to work (that is, it gives the same estimates as my version with pm.Potential). It’s basically an attempt at writing my own AR(1) model:

def build_model():
y1 = shared(np.array([10, 13, 11, 14, 11, 12, 9, 11, 10, 12], dtype=np.float32))
y2 = shared(np.array([15, 10, 13, 11, 14, 11, 12, 9, 11, 10], dtype=np.float32))

def likelihood(y1, y2):
    nu_t = mu + phi * y2
    err = y1 - nu_t

    return pm.Normal.dist(0, sd=sigma).logp(err)

with pm.Model() as ar_model:
    sigma = pm.HalfNormal('sigma', 5.)
    phi = pm.Normal('phi', 0., sd=2.)
    mu = pm.Normal('mu', 0., sd=10.)

    pm.DensityDist('like', likelihood, observed={'y1': y1, 'y2': y2})

return ar_model

So by pre-lagging the observations and passing each lag of the original vector y separately to the likelihood function I’m able to write logps that reference all lags as part of the calculation. Does this code make sense? The part I’m not clear about is why is my likelihood function able to reference phi, mu, and sigma even though it is declared outside of the model – is that supposed to be the case?

Topic		Replies	Views
Pointwise log-likelihood for loo/waic with custom "black-box" probability distribution v5	9	733	January 19, 2024
Log-likelihood not found in InferenceData when model as pm.Potential in PYMC v5	3	409	March 27, 2024
Calculating WAIC for models with multiple likelihood functions Questions	20	4496	October 26, 2020
How to define likelihood by myself in Pymc3	4	687	October 30, 2023
Explanation of logp and customDist for Blackbox likelihood functions v5	4	402	February 2, 2024

LOO/WAIC metrics from pm.Potential based model?

Related topics