How to save logp values during sampling?

madanh · January 29, 2018, 9:01am

I asked a similar question here, but this time I’d like to know whether there’s a way to add the logp values of accepted points to the trace as we sample, so that we don’t have to evaluate them twice?

junpenglao · January 29, 2018, 9:20am

You are asking some great questions. These should go into our docs, I think…
There are two ways to do it. The easiest is to create a Deterministic RV to save the logp. This is what we currently do in SMC:

github.com

pymc-devs/pymc3/blob/4357100514cc9615c8c2598ab3fe0d9a36c34c7d/pymc3/step_methods/smc.py#L133-L134


with model:
    llk = pm.Deterministic(likelihood_name, model.logpt)

A more complicated way is to put the logp into the sampler statistics of the trace. There is a discussion here:
https://github.com/pymc-devs/pymc3/pull/2339#pullrequestreview-45401866
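
To make the "sampler statistics" idea concrete outside of PyMC3, here is a minimal sketch of a hand-written random-walk Metropolis sampler that caches the logp of the current point and stores it next to every draw. It is not PyMC3 code: the `logp` function (a standard normal) and the `metropolis_with_logp` helper are assumptions made up for illustration.

```python
import math
import random


def logp(x):
    """Log-density of a standard normal (a stand-in for a model's logp)."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)


def metropolis_with_logp(n_draws, step=1.0, seed=0):
    """Random-walk Metropolis that records logp alongside each draw."""
    rng = random.Random(seed)
    x = 0.0
    lp = logp(x)  # evaluated once for the starting point
    draws, logps = [], []
    for _ in range(n_draws):
        proposal = x + rng.gauss(0.0, step)
        lp_prop = logp(proposal)  # the only logp evaluation in this iteration
        # Accept with probability min(1, exp(lp_prop - lp)).
        # 1.0 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < lp_prop - lp:
            x, lp = proposal, lp_prop
        draws.append(x)
        logps.append(lp)  # reuse the cached value, no second evaluation
    return draws, logps


draws, logps = metropolis_with_logp(1000)
```

Because the sampler already needs the logp of the current point for the accept/reject step, saving it costs nothing extra. That is the motivation for storing it as a sampler statistic.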

Pitjip · June 21, 2019, 6:48am

Interesting thread! Is there any way to save the logp for each data point as an array, so that each element becomes a vector instead of a scalar?

junpenglao · June 21, 2019, 9:36am

Depending on how you define the logp for each data point (i.e., element-wise logp), you can compute it by evaluating the logprob function of the observed variable conditioned on the posterior, as in:

github.com

pymc-devs/pymc3/blob/596db1ad5f2c72ba0c7207f63bce2b2b29c36bbe/pymc3/stats.py#L125


def _log_post_trace(trace, model=None, progressbar=False):
    """Calculate the elementwise log-posterior for the sampled trace.


    Parameters
    ----------
    trace : result of MCMC run
    model : PyMC Model
        Optional model. Default None, taken from context.
    progressbar: bool
        Whether or not to display a progress bar in the command line. The
        bar shows the percentage of completion, the evaluation speed, and
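
As a rough illustration of what "element-wise logp conditioned on the posterior" means, the sketch below builds an (n_draws, n_obs) table by hand for a Normal likelihood. It does not call PyMC3; the posterior draws, the data, and the `normal_logpdf` helper are hypothetical values chosen for the example.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log-density of Normal(mu, sigma) evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


# Hypothetical posterior draws (one mu and one sigma per draw) and observed data.
mu_draws = [0.1, -0.2, 0.05]
sigma_draws = [1.0, 1.1, 0.9]
y_obs = [0.3, -0.5, 1.2, 0.0]

# One row per posterior draw, one column per data point.
elementwise = [
    [normal_logpdf(y, mu, sigma) for y in y_obs]
    for mu, sigma in zip(mu_draws, sigma_draws)
]

# Summing a row recovers the scalar log-likelihood of that draw.
total_per_draw = [sum(row) for row in elementwise]
```

Each row is the vector asked about above; the scalar per draw is just its sum.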

Pitjip · June 21, 2019, 12:17pm

Thanks for your help. Additionally, is it possible to use this capability with a new set of observed values (y)?
I want to make a dataframe with the following columns: [parameter 1, parameter 2, Y, (log)likelihood]
This will allow me to plot multiple paths of the PDF of the distribution in a Bayesian way. I could not find this functionality in the package. Thanks!

junpenglao · June 21, 2019, 12:43pm

I don’t think there is a built-in function for that. I would probably extract the parameters from the trace and compute the logp conditioned on the new observations by hand.
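
A minimal sketch of the by-hand approach, assuming a Normal likelihood whose two parameters are `mu` and `sigma`. The `posterior` list stands in for values pulled out of the trace, and `y_new` and `normal_logpdf` are likewise made up for illustration.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log-density of Normal(mu, sigma) evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


# Hypothetical (mu, sigma) pairs standing in for draws extracted from the trace.
posterior = [(0.0, 1.0), (0.5, 2.0)]

# New observation values at which to evaluate the likelihood.
y_new = [-1.0, 0.0, 1.0]

# One record per (posterior draw, new y) combination.
rows = [
    {"mu": mu, "sigma": sigma, "y": y, "loglik": normal_logpdf(y, mu, sigma)}
    for mu, sigma in posterior
    for y in y_new
]
```

Passing `rows` to `pandas.DataFrame` should give the four-column layout described above, with one density curve per posterior draw once grouped by `mu` and `sigma`.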

ally-lee · July 28, 2020, 6:12pm

I know this is an old issue, but I thought I’d share a solution to compute the logp of a set of new observations. You can use the same trick as above and define a Deterministic RV for the logp. Then use sample_posterior_predictive. Assuming you’ve already produced a posterior sample trace, and y_future is your new observed variable, use the following code:

with model:
    llk = pm.Deterministic('llk', y_future.logpt)
    logp = pm.sample_posterior_predictive(trace, vars=[llk], keep_size=True)

Set keep_size=True to compute the logp for each sample of the trace and retain the shape (n_chains, n_samples).
