Any experience with reLOO?

I notice that ArviZ has a reloo function, but I haven’t found any PyMC3 examples using it. Has anyone had success with this? I know it’s flagged as “experimental” right now.

I want to compare models that have 1-2 data points with LOO Pareto shape values k > 0.7. It would be great to calculate these LOO points without PSIS.
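For context, here is how those points can be identified in the first place (a minimal sketch, assuming idata already holds the pointwise log likelihood):

import arviz as az
import numpy as np

# pointwise=True keeps the per-observation Pareto k values
loo_orig = az.loo(idata, pointwise=True)
# indices of the observations whose Pareto shape exceeds 0.7
bad_idx = np.argwhere(loo_orig.pareto_k.values > 0.7).ravel()
print(bad_idx)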


It would be great if you could help test it out. Right now there is only one example available on the website, for PyStan, but trying it out on real models would be amazing.

I have recently refocused on this a little, with hopes of expanding to PSIS-LFO and SMC, and I started working on the PyMC3 wrapper too on this PR. It would be great if you could check out the branch and try it. I’ll rebase it with master.

Getting it to work with PyMC3 has its issues, but they can be sorted out. The PR actually has 2 proposals that try to abstract the process as much as possible; feedback on the general API and on how these 2 alternatives compare will be most appreciated. I still want to try a 3rd option using symbolic_pymc, though; I hope I’ll have time within the next 2 weeks.

I personally prefer the alternative that relies more on xarray over the one that uses pymc3 more intensively, as I think it requires less mental overhead to use. Here is a link to the notebook, which I think you’ll be able to use by modifying only the sel_observations method.
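For reference, sel_observations is the method that splits the data into the observations used for refitting and the excluded ones; a minimal sketch of what it is expected to do (the data attribute and indexing here are hypothetical):

import numpy as np

def sel_observations(self, idx):
    # boolean mask selecting the observation(s) to exclude
    mask = np.isin(np.arange(len(self.data)), idx)
    data__i = self.data[~mask]  # data used to refit the model
    data_ex = self.data[mask]   # excluded data, only used to evaluate its log likelihood
    return data__i, data_ex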

The link to the second alternative is here. Unlike in the previous case, there is no need to rewrite the pointwise log likelihood formula with numpy, but the resulting class may be slightly harder to generalize.

Let me know if you need help adapting this to your models.


Also, the wrapper is currently designed to recompile the model every time a refit is needed, because the model has to be recompiled whenever the shape of the observations changes. With LFO every refit changes the shape of the observations, but for reloo this is not the case; if your model were really slow to compile we could look into this, although at least one recompilation would still be needed. With only 1-2 data points to which reloo needs to be applied, it may not be worth going to all this trouble.

There is now an example of reloo usage with ArviZ in one of the issues: https://github.com/arviz-devs/arviz/issues/1404. I thought it could be interesting to people reading this.


I’ve looked through your examples a couple of times now, and I’m still lost. I can’t figure out how to actually use reloo.

What kind of model are you trying to use? Are the issues with creating the sampling wrapper, or with using the wrapper to call reloo?

Hi, can you relink the notebook you are referring to? It says 404 not found, and I have the same problem as sammosummo: I don’t know how to adjust / use the wrapper. I don’t know where to modify the example wrapper class or how to find the correct arguments that need to be passed. Do I need to write a custom log likelihood function in any case? How does it work with the coords and dimensions? I need to pass the coords with

with pm.Model() as model:
    model.add_coord('obs_id', obs_id, mutable=True)

because I need to predict out-of-sample outcomes in the next step. The post says: "Note that for reloo you should not use coordinate values (thus no coords argument passed to pm.Model nor to from_pymc3) because each iteration would have different values for them, causing the program to error out." - but I don’t get what that means - could you elaborate?

Hi. OriolAbril kindly helped me a lot with reLOO a few months ago. You can see this thread:
https://discourse.pymc.io/t/reloo-is-using-average-log-likelihoods-valid/10089
Maybe you’ll find it helpful. If I remember correctly, there are several useful links in the thread that you could use as tutorials.

It was merged a while ago; it is now in the ArviZ docs (but still runs on v3, not v4): https://python.arviz.org/en/latest/user_guide/pymc3_refitting.html.

When doing the refits with reloo the shape of the observed data also changes, so you should either not use coords (and use dims only) or update the coords in the model as part of the sel_observations or sample methods.
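A minimal sketch of the second option, assuming a dim named "obs_id", a refit data dict with an "x" entry, and a PyMC version where Model.set_dim is available:

def sample(self, modified_observed_data):
    with self.model:
        n__i = len(modified_observed_data["x"])
        # resize the dim with fresh coordinate values before swapping
        # the data in, so shapes and coords stay consistent at every refit
        self.model.set_dim("obs_id", n__i, coord_values=np.arange(n__i))
        pm.set_data(modified_observed_data)
        idata = pm.sample(**self.sample_kwargs)
    return idata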

Thank you, I already had a look at that example, but I am working with PyMC 4 - is there an equivalent to “az.from_pymc3()” and other attributes like “idata__i.pymc3_trace”?

What exactly is the difference between coords and dims? Don’t I need to specify coords to then pass dims?

Do you mean that within the wrapper I would need to update the coords, similar to what I do with out-of-sample predictions?

Something along the lines of:

def sample(self, modified_observed_data):
    with self.model(**modified_observed_data) as model:
        model.set_data('x', x, coords={'obs_id': obs_id})
        model.set_data('c', c, coords={'participants': participants})
        trace = pm.sample(
            **self.sample_kwargs,
            return_inferencedata=False,
            idata_kwargs={"log_likelihood": False}
        )
    self.pymc4_model = model
    return trace

It is still an open PR but this should help: Refitting PyMC models with ArviZ — ArviZ dev documentation (note the 2158 (PR) in the flyout at the bottom right of the page). There were also a couple of things outdated in reloo, so if you want to try it before the PR is merged you’ll need to install arviz from the specific PR: Updates to wrappers and refitting stats by OriolAbril · Pull Request #2158 · arviz-devs/arviz · GitHub

For anyone coming to this later, I have reworked the tutorial in the ArviZ documentation a bit (@OriolAbril) so that it is done directly using the model, without having to reconstruct the likelihood. I think it would not be possible to make it completely general; however, this is pretty close and should need just a few changes to work on any other model. Also @Simon,
I think there are some other problems in the example given in the link. For instance, here

def log_likelihood__i(self, excluded_observed_data, idata__i):
    log_lik__i = idata.log_likelihood['y']
    return log_lik__i

What you should really be returning is the likelihood of the excluded observable given the parameters from the posterior trained on the non-excluded observables (as in the ArviZ guide page). But this is just returning the log_likelihood from the original idata (note that not even something depending on idata__i would be correct here, since that would be the log likelihood of the non-excluded observables, which is not what you want, if I am not misunderstanding reloo). In any case, the code for the linear regression example is below; I have checked it against the example there and it does exactly the same thing.

import arviz as az
import pymc as pm
import numpy as np
import pytensor.tensor as pt
from xarray import DataArray

class PyMCLinRegWrapper(az.PyMCSamplingWrapper):

  def sample(self, modified_observed_data):

    with self.model:
      # when the model has coords, the dim needs to be updated before
      # modifying the data in the model with set_data;
      # otherwise we would not need to override the sample method at all
      n__i = len(modified_observed_data["x"])
      self.model.set_dim("time", n__i, coord_values=np.arange(n__i))
      pm.set_data(modified_observed_data)

      idata = pm.sample(
          **self.sample_kwargs,
          return_inferencedata=True,
          idata_kwargs={"log_likelihood": True}
      )
    return idata

  def log_likelihood__i(self, excluded_observed_data, idata__i):

    # get posterior trained on non excluded observables
    post = idata__i.posterior

    # create a model where x is set to the excluded x;
    # the value of y does not really matter here, but it is set too
    _model = linreg_model()
    _model.set_dim("time", excluded_observed_data["x"].size,
                   coord_values=excluded_observed_data["x"])
    pm.set_data(excluded_observed_data, model=_model)

    # define the function log likelihood(obs | params)
    value = pt.tensor("value", dtype=float, shape=())
    logp_fun = pm.compile_pymc([value, _model["b0"], _model["b1"], _model["sigma_e"]],
                               pm.logp(_model["y"], value))
    logp_fun = np.vectorize(logp_fun)

    # return log likelihood (excluded obs | params trained on non excluded)
    return DataArray(logp_fun(value=excluded_observed_data["y_obs"],
                     b0=post["b0"], b1=post["b1"], sigma_e=post["sigma_e"]))


  def sel_observations(self, idx):

    xdata = self.idata_orig["constant_data"]["x"]
    ydata = self.idata_orig["observed_data"]["y"]
    mask = np.isin(np.arange(len(xdata)), idx)
    data_dict = {"x": xdata, "y_obs": ydata}
    data__i = {key: value.values[~mask] for key, value in data_dict.items()}
    data_ex = {key: value.isel(time=idx) for key, value in data_dict.items()}

    return data__i, data_ex


def linreg_model():

  coords = {"time":[]}
  x = []
  y = []
  #dont forget to set data and coords

  with pm.Model(coords=coords) as linreg_model:

    x = pm.Data("x", x, dims="time")
    y_obs = pm.Data("y_obs", y, dims="time")

    b0 = pm.Normal("b0", 0, 10)
    b1 = pm.Normal("b1", 0, 10)
    sigma_e = pm.HalfNormal("sigma_e", 10)

    pm.Normal("y", b0 + b1 * x, sigma_e, observed=y_obs, dims="time")

  return linreg_model


sample_kwargs = {"chains": 4, "draws": 500}

seed = 4
rng = np.random.default_rng(seed)
xdata = np.linspace(0, 50, 100)
b0, b1, sigma = -2, 1, 3
ydata = rng.normal(loc=b1 * xdata + b0, scale=sigma)
ydata[50] += 50  # modify an obs so it has a high pareto_k



with linreg_model() as model:

  model.set_dim("time", xdata.size, coord_values=xdata)
  pm.set_data({"x":xdata, "y_obs":ydata})

  idata = pm.sample(**sample_kwargs)
  pm.compute_log_likelihood(idata, extend_inferencedata=True)

loo_orig = az.loo(idata, pointwise=True)
print(loo_orig)
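From here the wrapper is used as in the guide; a sketch of the final step (the constructor arguments follow the az.SamplingWrapper signature):

pymc_wrapper = PyMCLinRegWrapper(
    model=model, idata_orig=idata, sample_kwargs=sample_kwargs
)
# refit once per high-k observation and replace its PSIS estimate
loo_relooed = az.reloo(pymc_wrapper, loo_orig=loo_orig)
print(loo_relooed)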

Thanks for pointing this out. If I understood the wrapper correctly, the only thing the log_likelihood__i function is doing, in the particular case of my code at least, is excluding the indices not to be relooed, as I’m computing the likelihood directly in the sample function:

    def sample(self, modified_observed_data):
        with self.model(**modified_observed_data) as mod:
            idata = pm.sample(
                **self.sample_kwargs,
                return_inferencedata=True,
                idata_kwargs={"log_likelihood": True}
            )

        self.pymc3_model = mod
        # combine the two pointwise log likelihoods into a single variable
        idata.log_likelihood['y'] = idata.log_likelihood['yh'] + idata.log_likelihood['yf']
        idata.log_likelihood = idata.log_likelihood.drop(['yh', 'yf'])
        #idata.log_likelihood['y'] = idata.log_likelihood['y'].mean(axis=3)
        idata = idata.log_likelihood['y']
        return idata

Here is where it is applied in the wrapper:

# adapted from arviz.stats.stats_refitting; it relies on these ArviZ internals
import logging

import numpy as np
from arviz import loo
from arviz.stats.stats_utils import logsumexp as _logsumexp

_log = logging.getLogger(__name__)


def reloo(wrapper, loo_orig=None, k_thresh=0.7, scale=None, verbose=True):

    required_methods = ("sel_observations", "sample", "get_inference_data", "log_likelihood__i")
    not_implemented = wrapper.check_implemented_methods(required_methods)
    if not_implemented:
        raise TypeError(
            "Passed wrapper instance does not implement all methods required for reloo "
            f"to work. Check the documentation of SamplingWrapper. {not_implemented} must be "
            "implemented and were not found."
        )
    if loo_orig is None:
        loo_orig = loo(wrapper.idata_orig, pointwise=True, scale=scale)
    loo_refitted = loo_orig.copy()
    khats = loo_refitted.pareto_k
    print('khats: '+str(khats.shape))
    loo_i = loo_refitted.loo_i
    scale = loo_orig.scale

    if scale.lower() == "deviance":
        scale_value = -2
    elif scale.lower() == "log":
        scale_value = 1
    elif scale.lower() == "negative_log":
        scale_value = -1
    lppd_orig = loo_orig.p_loo + loo_orig.elpd_loo / scale_value
    n_data_points = loo_orig.n_data_points

    # if verbose:
    #     warnings.warn("reloo is an experimental and untested feature", UserWarning)

    if np.any(khats > k_thresh):
        #print('index: '+str(idx))        
        for idx in np.argwhere(khats.values > k_thresh):
            if verbose:
                _log.info("Refitting model excluding observation %d", idx)
            new_obs, excluded_obs = wrapper.sel_observations(idx)
            fit = wrapper.sample(new_obs)
            idata_idx = wrapper.get_inference_data(fit)
            log_like_idx = wrapper.log_likelihood__i(excluded_obs, idata_idx).values.flatten()
            loo_lppd_idx = scale_value * _logsumexp(log_like_idx, b_inv=len(log_like_idx))
            khats[idx] = 0
            loo_i[idx] = loo_lppd_idx
        loo_refitted.loo = loo_i.values.sum()
        loo_refitted.loo_se = (n_data_points * np.var(loo_i.values)) ** 0.5
        loo_refitted.p_loo = lppd_orig - loo_refitted.loo / scale_value
        return loo_refitted
    else:
        _log.info("No problematic observations")
        return loo_orig

This particular line:

log_like_idx = wrapper.log_likelihood__i(excluded_obs, idata_idx).values.flatten()

Maybe I’m still missing something. But otherwise I don’t get why reloo is actually working on my models, substantially reducing the k diagnostic values:

Original model:

Pareto k diagnostic values:
                         Count   Pct.
(-Inf, 0.5]   (good)        3    1.5%
 (0.5, 0.7]   (ok)         30   15.0%
   (0.7, 1]   (bad)       124   62.0%
   (1, Inf)   (very bad)   43   21.5%

Relooed model:

Pareto k diagnostic values:
                         Count   Pct.
(-Inf, 0.5]   (good)      170   85.0%
 (0.5, 0.7]   (ok)         30   15.0%
   (0.7, 1]   (bad)         0    0.0%
   (1, Inf)   (very bad)    0    0.0%

But maybe there’s another reason explaining this. Thanks again.

So I was thinking more along the lines of how it is done in the guide:

pymc reloo with arviz

where one calls

az.reloo(pymc_wrapper, loo_orig=loo_orig)

In that case, log_likelihood__i for the az.PyMCSamplingWrapper should be something that returns the likelihood of the excluded observable given the parameters trained on the non-excluded ones. Once this is set correctly, az.reloo(pymc_wrapper, loo_orig=loo_orig) does the rest for you. I see that you have sort of written your own version of reloo. What does

wrapper.log_likelihood__i

look like in your case? In the previous link you gave it was

def log_likelihood__i(self, excluded_observed_data, idata__i):
    log_lik__i = idata.log_likelihood['y']
    return log_lik__i

I don’t quite understand how this works, because it does not even use excluded_observed_data or idata__i?
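For reference, a sketch of what a correct log_likelihood__i could look like for a model with two observed variables yh and yf, assuming self.model is the same model factory used in sample and a PyMC version that provides pm.compute_log_likelihood:

def log_likelihood__i(self, excluded_observed_data, idata__i):
    # point the model at the excluded observation(s), then evaluate their
    # log likelihood under the posterior that was fitted without them
    with self.model(**excluded_observed_data) as mod:
        pm.compute_log_likelihood(idata__i, model=mod, extend_inferencedata=True)
    log_lik = idata__i.log_likelihood
    # with two likelihood terms, the pointwise log likelihood is their sum
    return log_lik['yh'] + log_lik['yf']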

I just use that part to compute the index here:

log_like_idx = wrapper.log_likelihood__i(excluded_obs, idata_idx).values.flatten()

which allows taking the log_likelihood from the sampled idata on the segmented portion of observations in these lines:

        idata.log_likelihood['y'] = idata.log_likelihood['yh'] + idata.log_likelihood['yf']
        idata.log_likelihood = idata.log_likelihood.drop(['yh', 'yf'])
        #idata.log_likelihood['y'] = idata.log_likelihood['y'].mean(axis=3)
        idata = idata.log_likelihood['y']
        return idata

The reason I’ve done this is that I have two likelihoods in my model, so the wrapper was not working as-is for my case. After the model is fitted on the data with excluded observations, it is possible to take the LOO stats here:

            log_like_idx = wrapper.log_likelihood__i(excluded_obs, idata_idx).values.flatten()
            loo_lppd_idx = scale_value * _logsumexp(log_like_idx, b_inv=len(log_like_idx))
            khats[idx] = 0
            loo_i[idx] = loo_lppd_idx
        loo_refitted.loo = loo_i.values.sum()
        loo_refitted.loo_se = (n_data_points * np.var(loo_i.values)) ** 0.5
        loo_refitted.p_loo = lppd_orig - loo_refitted.loo / scale_value
        return loo_refitted

The other tricky thing with my model, which pushed me towards this approach, is that the priors have the same length as the observations, so any LOO must be a reloo, as observations need to be excluded one by one (that’s why it gets very high k values to start with).

Sorry for the contrived explanation; maybe it still doesn’t make much sense. When I have some time (not in the near future :/) I’ll try to apply the wrapper as-is to compare, maybe on some version of the model with a single likelihood (which may not be possible). Thanks for raising this, it helps to double-check.