Interpreting the results of pymc3

I am having a hard time interpreting the results of pymc3. My apologies in advance, I realize this is 101.

I have a custom model, and I generate my own data for tests. I use pymc3 to recover the parameters of my artificially generated dataset to see if it’s working.

  • find_MAP returns estimates that are far off. I guess that’s OK.
  • The posterior means of the parameters after sampling are also way off.
  • The logp of the true parameters is better than the logp of the posterior means.

I would have suspected my model if it were not for the logp being better at the true params.
Somehow, I was assuming that MCMC would sample around parameters with higher logp. But from what I see, the posterior distributions of the parameters are centered in completely wrong regions.

My true params are Unseen: 8000, alpha: 0.25 and beta: 5.
For these parameters I use Uniform priors with pretty large intervals (I don’t have any prior information).
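
To give an idea, here is a stripped-down sketch of the setup (the bounds are just wide placeholders and the custom likelihood is omitted, so this is not my actual model):

    import pymc3 as pm

    with pm.Model() as model:
        # wide Uniform priors, meant to be "uninformative"
        unseen = pm.Uniform("Unseen", lower=0, upper=50000)
        alpha = pm.Uniform("alpha", lower=0, upper=1)
        beta = pm.Uniform("beta", lower=0, upper=20)

        # ... lengthy custom likelihood on the generated data goes here ...

        map_estimate = pm.find_MAP()
        trace = pm.sample(2000)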

At the end of the day, my goal is to report the estimated true parameters with confidence intervals. I assumed that taking the posterior mean of each parameter together with its standard deviation is how this should be done with pymc3.
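
Concretely, this is roughly what I have been reporting (continuing from the sketch above, so trace is the result of pm.sample):

    # posterior mean +/- standard deviation for each parameter
    for name in ["Unseen", "alpha", "beta"]:
        print(name, trace[name].mean(), "+/-", trace[name].std())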
Does this make sense?

Dear @ded,

this seems to be a rather broad question, but I think you have already understood a lot.
Since you write that the data is generated and that the means are off, I would guess your model is off. Note that this could mean either that the model is misspecified or that the data generation is not producing what you want. Could you provide the code, please?
Now regarding your bigger question of how to interpret model results: that depends on the model context. I am not well trained in non-Bayesian statistics (apologies to my stats professors, but this is what all evidence suggests), so I have a hard time understanding what you are trying to achieve with “logp”. Note that some terms (such as “likelihood”) mean different things in different statistical schools; there is a great talk by Richard McElreath here (youtube). Not having “any prior” might also be something to question, as I think is also discussed in the video.
In the Bayesian setting, authors argue against talking about “confidence intervals”. Instead, people report “credible intervals” (I first got that from Gelman’s book). This usually refers to the 95% highest-posterior-density (HPD) interval of the posterior.
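
For example, assuming trace is what pm.sample returned for your model (exact keyword and column names differ a bit between PyMC3 / ArviZ versions):

    import pymc3 as pm

    # summary table with posterior mean, sd and the HPD / credible interval
    print(pm.summary(trace))

    # posterior plots with the HPD interval marked
    pm.plot_posterior(trace)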

Again, more can be said once we can inspect your model.

Cheers,

Falk


First of all, change the sampling call to this:

    new_trace = pm.sample(5000, chains=4)

Then compare the trace plots (and, if you want, post the trace plot here). What changes do you see?
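
Something along these lines, assuming trace is your original run and new_trace the one above:

    import pymc3 as pm

    # compare how the chains mix and where they settle in the two runs
    pm.traceplot(trace)       # original run
    pm.traceplot(new_trace)   # 5000 draws, 4 chains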

Thank you all for the advice!

I didn’t share the full model code because the custom likelihood function is lengthy and my code isn’t exactly readable yet 🙂
Nonetheless, based on your insights and the weird results I’ve been getting, I discovered that my model was indeed misspecified, which explains the divergent results.

My question about interpreting the results remains.
So far, in all successful examples I have tried (regardless of the model, really), the trace plots tend to be roughly Gaussian around the true parameter. Whenever the plots show something different from that, I become suspicious of my code.
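
For instance, in those test runs I check the posteriors against the known true values with something like this (argument names may differ slightly between PyMC3 / ArviZ versions; older ones use varnames instead of var_names):

    import pymc3 as pm

    # mark the known true parameters on the posterior plots as a sanity check
    pm.plot_posterior(trace, var_names=["Unseen"], ref_val=8000)
    pm.plot_posterior(trace, var_names=["alpha"], ref_val=0.25)
    pm.plot_posterior(trace, var_names=["beta"], ref_val=5)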

I am new to Bayesian inference and slowly catching up, using pymc3. What I am asking is whether there are guidelines or detailed examples that explain how to interpret and report the results properly.

Thanks!