Any experience with reLOO?

I notice that ArviZ has a reloo function, but I haven’t found any PyMC3 examples using it. Has anyone had success with this? I know it’s flagged as “experimental” right now.

I want to compare models that have 1-2 data points with LOO Pareto shape values > 0.7. It would be great to calculate these LOO points exactly, without PSIS.

It would be great if you could help test it. Right now there is only one example available on the website, for PyStan, but trying it out on real models would be amazing.

I have recently refocused on this a little, with hopes of expanding to PSIS-LFO and SMC, and I have also started working on the PyMC3 wrapper on this PR. It would be great if you could check out the branch and try it. I’ll rebase it on master.

Getting it to work with PyMC3 has its issues, but they can be sorted out. The PR actually has 2 proposals that try to abstract the process as much as possible. Feedback on the general API and on how these 2 alternatives compare would be most appreciated. I still want to try a 3rd option using symbolic_pymc though; I hope I’ll have time within the next 2 weeks.

I personally prefer the alternative that relies more on xarray instead of the one that uses pymc3 more intensively as I think it requires less mental overhead to use. Here is a link to the notebook which I think you’ll be able to use modifying only the sel_observations method.

The link to the second alternative is here. Unlike in the previous case, there is no need to rewrite the pointwise log likelihood formula with numpy, but the resulting class may be slightly harder to generalize.
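To give an idea of what "rewriting the pointwise log likelihood with numpy" involves in the first alternative, here is a minimal sketch for a linear regression with a normal likelihood. The model, the function names and the posterior draws are all made up for illustration; they are not taken from the notebook.

```python
import numpy as np


def normal_logpdf(y, mu, sigma):
    """Elementwise log density of Normal(mu, sigma) evaluated at y."""
    return -0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2


def pointwise_log_likelihood(x_ex, y_ex, b0, b1, sigma):
    """Log likelihood of each excluded observation under each posterior draw.

    b0, b1, sigma: 1-D arrays of posterior draws (length n_draws).
    x_ex, y_ex: 1-D arrays with the excluded observations (length n_ex).
    Returns an array of shape (n_draws, n_ex).
    """
    mu = b0[:, None] + b1[:, None] * x_ex[None, :]
    return normal_logpdf(y_ex[None, :], mu, sigma[:, None])


# Made-up posterior draws and held-out data (assumptions, illustration only)
rng = np.random.default_rng(0)
b0 = rng.normal(1.0, 0.1, size=500)
b1 = rng.normal(2.0, 0.1, size=500)
sigma = np.abs(rng.normal(0.5, 0.05, size=500))
x_ex = np.array([0.3, 1.7])
y_ex = np.array([1.6, 4.4])

log_lik = pointwise_log_likelihood(x_ex, y_ex, b0, b1, sigma)
```

The formula has to match the likelihood declared in the model exactly, which is the extra work the second alternative avoids.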

Let me know if you need help adapting this to your models

Also, the wrapper is currently designed to recompile the model every time a refit is needed, because the model has to be recompiled whenever the shape of the observations changes. With LFO every refit changes the shape of the observations, but for reloo this is not the case, since every refit leaves out a single observation. If your model were really slow to compile we could look into avoiding the repeated compilation, but at least one recompilation will always be needed. With only 1-2 data points needing reloo, it may not be worth going to all this trouble.
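For anyone following along, what reloo does conceptually can be sketched with plain numpy. This is a toy illustration under stated assumptions, not ArviZ's implementation: the "model" is a normal mean with known sigma so that refitting is analytic, `fit` and `log_pred_density` are hypothetical helpers, and the PSIS values and Pareto k diagnostics are placeholders standing in for what a pointwise LOO computation would return.

```python
import numpy as np

rng = np.random.default_rng(42)


def fit(y, n_draws=4000, prior_sd=10.0, sigma=1.0):
    """Stand-in for MCMC: exact posterior draws of mu for y ~ Normal(mu, sigma)."""
    precision = 1.0 / prior_sd**2 + len(y) / sigma**2
    mean = (y.sum() / sigma**2) / precision
    return rng.normal(mean, np.sqrt(1.0 / precision), size=n_draws)


def log_pred_density(y_i, mu_draws, sigma=1.0):
    """Log of the likelihood of y_i averaged over draws (log-mean-exp)."""
    log_lik = -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y_i - mu_draws) / sigma) ** 2
    return np.logaddexp.reduce(log_lik) - np.log(len(mu_draws))


y = np.array([0.2, -0.5, 1.1, 0.4, 6.0])               # last point is an outlier
elpd_psis = np.array([-1.1, -1.2, -1.3, -1.0, -9.0])   # placeholder PSIS values
pareto_k = np.array([0.1, 0.2, 0.1, 0.0, 0.9])         # placeholder diagnostics

elpd = elpd_psis.copy()
for i in np.flatnonzero(pareto_k > 0.7):
    mu_draws = fit(np.delete(y, i))              # refit without observation i
    elpd[i] = log_pred_density(y[i], mu_draws)   # exact value replaces the PSIS one
```

Only the flagged observations trigger a refit, and every refit sees the same number of observations, one fewer than the original fit.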

There is now an example of reloo usage with ArviZ in this issue: https://github.com/arviz-devs/arviz/issues/1404. I thought it could be interesting to people reading this.

I’ve looked through your examples a couple of times now, and I’m still lost. Can’t figure out how to actually use reloo.

What kind of model are you trying to use? Is the issue with creating the sampling wrapper or with using the wrapper to call reloo?

Hi, can you relink the notebook you are referring to? It returns a 404 Not Found error, and I have the same problem as sammosummo: I don’t know how to adjust or use the wrapper. I don’t know where to modify the example wrapper class or how to find the correct arguments that need to be passed. Do I need to write a custom log likelihood function in any case? How does it work with coords and dimensions? I need to pass the coords with

with pm.Model() as model:
    model.add_coord('obs_id',obs_id, mutable=True)

because I need to predict out-of-sample outcomes in the next step. The post says "Note that for reloo you should not use coordinate values (thus not coords argument passed to pm.Model nor to from_pymc3) because each iteration would have different values for them causing the program to error out." I don’t understand what that means. Could you elaborate?

Hi. OriolAbril kindly helped me a lot with reLOO a few months ago. You can see this thread:
https://discourse.pymc.io/t/reloo-is-using-average-log-likelihoods-valid/10089
Maybe you’ll find it useful. If I remember correctly, there are several useful links in the thread that you could use as tutorials.

It was merged a while ago, it is now on arviz docs (but still runs on v3, not v4): Refitting PyMC3 models with ArviZ — ArviZ dev documentation.

When doing the refits with reloo the shape of the observed data also changes, so you should either not use coords (and use dims only) or update the coords in the model as part of the sel_observations or sample methods.
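As a rough sketch of the second option, the split done in sel_observations can subset the coordinate values with the same mask as the data, so that their length always matches the refit data. The function name `split_observations` and the arrays are hypothetical, and the sketch uses plain numpy rather than the InferenceData objects the wrapper works with.

```python
import numpy as np


def split_observations(idx, x, y, obs_id):
    """Split data into the part used for refitting and the excluded part.

    idx: integer positions of the observations to leave out.
    Returns (data__i, data_ex), each a dict of arrays. obs_id is subset
    with the same mask, so its length always matches x and y.
    """
    mask = np.isin(np.arange(len(y)), idx)
    data__i = {"x": x[~mask], "y": y[~mask], "obs_id": obs_id[~mask]}
    data_ex = {"x": x[mask], "y": y[mask], "obs_id": obs_id[mask]}
    return data__i, data_ex


# Made-up data for illustration
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 1.9, 4.2, 5.8])
obs_id = np.array(["a", "b", "c", "d"])

data__i, data_ex = split_observations([2], x, y, obs_id)
```

The subset `obs_id` in `data__i` is what would then be used to update the coords when the model is rebuilt for the refit.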

Thank you, I already had a look at that example but I am working with pymc 4 - is there an equivalent to “az.from_pymc3()” and other attributes like “idata__i.pymc3_trace”?

What exactly is the difference between coords and dims? Don’t I need to specify coords to then pass dims?

Do you mean within the wrapper I would need to update the coords similar to what I do with oos-preds?

Something along the lines of:

def sample(self, modified_observed_data):
    with self.model(**modified_observed_data) as model:
        model.set_data('x', x, coords={'obs_id': obs_id})
        model.set_data('c', c, coords={'participants': participants})
        trace = pm.sample(
            **self.sample_kwargs,
            return_inferencedata=False,
            idata_kwargs={"log_likelihood": False},
        )
    self.pymc4_model = model
    return trace

It is still an open PR but this should help: Refitting PyMC models with ArviZ — ArviZ dev documentation (note the 2158 (PR) in the flyout at the bottom right of the page). There were also a couple things outdated in reloo, so if you want to try it before the PR is merged you’ll need to install arviz from the specific PR: Updates to wrappers and refitting stats by OriolAbril · Pull Request #2158 · arviz-devs/arviz · GitHub