Fit Weibull to Data using Jeffreys Prior

I’d like to fit a Weibull model to data by using a non-informative prior (e.g., Jeffreys prior), but I’m not sure how to set it up in PyMC3. I’m using the following references:

According to the latter link (page 14275), the Jeffreys prior is as follows:

\pi(\alpha, \beta) \propto \frac{1}{\alpha\beta}

How do I define that in a Model environment? I’m confused because I have a joint Jeffreys prior for both \alpha and \beta. How do I define them? Sample code is given below.

import pymc3 as pm
import theano.tensor as tt

x = ... # Data (observations)

def loglike_alpha_beta(value):
    return -tt.log(value[0]) - tt.log(value[1]) # Not sure about this

with pm.Model() as model:
    alpha = ???
    beta = ???
    like = pm.Weibull("obs", ???, observed=x)
    
    start = ...
    trace = ...

If you treat the prior as a constraint added to the model's (log-)likelihood function, you could write:

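import pymc3 as pm
import theano.tensor as tt
# data holds the observations (x in the question)
with pm.Model() as model:
    # Improper flat priors on the positive half-line; the Potential below supplies the actual prior
    alpha = pm.HalfFlat('alpha')
    beta = pm.HalfFlat('beta')
    like = pm.Weibull('observation', alpha, beta, observed=data)
    # Adds log(1 / (alpha * beta)) to the model's log-probability
    prior = pm.Potential('prior', -tt.log(alpha * beta))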

However, uninformative priors do not work well with modern samplers such as HMC and NUTS, so be careful when you sample from this model.
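To make explicit what the Potential term contributes, here is a minimal plain-Python sketch of the unnormalized log-posterior for this setup. It assumes PyMC3's Weibull convention (alpha is the shape, beta is the scale) and the prior \pi(\alpha, \beta) \propto \frac{1}{\alpha\beta} quoted in the question. The function names are made up for illustration and are not part of PyMC3.

```python
import math


def weibull_loglike(alpha, beta, data):
    """Weibull log-likelihood with shape alpha and scale beta (assumed convention)."""
    n = len(data)
    return (
        n * math.log(alpha)
        - n * alpha * math.log(beta)
        + (alpha - 1.0) * sum(math.log(x) for x in data)
        - sum((x / beta) ** alpha for x in data)
    )


def log_prior(alpha, beta):
    """Log of the prior 1 / (alpha * beta), up to an additive constant."""
    return -math.log(alpha) - math.log(beta)


def log_posterior(alpha, beta, data):
    """Unnormalized log-posterior: log-likelihood plus log-prior."""
    if alpha <= 0.0 or beta <= 0.0:
        return -math.inf  # both parameters must be positive
    return weibull_loglike(alpha, beta, data) + log_prior(alpha, beta)
```

The `log_prior` term is exactly what `-tt.log(alpha*beta)` adds in the model, since log(alpha*beta) = log(alpha) + log(beta).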

Thanks for the code and explanation! So apparently, I shouldn’t be using Jeffreys prior:

  • Some principles we don’t like: invariance, Jeffreys, entropy

And that’s fine with me. One thing in the paper I referenced that caught my attention is the following pair of statements:

In general, MLE has high accuracy for parameter estimation when the sample size is large. However, for small sample sizes, the Bayesian approach is known to be more accurate.

And the authors cite a book (Berger, James O. Statistical decision theory and Bayesian analysis. Springer Science & Business Media, 2013.), which I don’t have access to. Do you happen to know why that’s the case? Any related references (hopefully accessible) would be appreciated!

I don’t think this is true. It may hold for some specific (low-dimensional) models, but in general you lose a lot of information using MLE.

Mmm… my problem is parameter estimation, and I’m currently considering a Weibull model. So I only have two parameters to estimate (including their uncertainty in the form of confidence intervals, or in a Bayesian world, credible intervals). And on top of that, I may have to deal with small sample sizes. So the question is: is there a preferred approach between MLE + bootstrap for confidence intervals vs. MCMC + credible intervals?

It depends on how much prior information you have and want to incorporate into your model. If you have relatively little prior information (i.e., you are going to use an uninformative or very vague prior) and a relatively small model (in terms of both number of parameters and sample size), the estimates from MLE and the posterior expectation from MCMC samples are often quite similar. The difficulty arises with larger models (e.g., a mixed-effects model instead of a GLM), where there are many more parameters to fit: MLE often does not converge, and there is not much of an alternative to MCMC for inference.
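For the MLE + bootstrap side of that comparison, a minimal standard-library sketch might look like the following. It is only an illustration under stated assumptions: alpha is the shape and beta is the scale (as in PyMC3), the shape is found by bisection on the profile-likelihood equation, and the intervals are simple percentile intervals. The function names `weibull_mle` and `bootstrap_ci` are hypothetical.

```python
import math
import random


def weibull_mle(data, lo=1e-3, hi=100.0, tol=1e-10):
    """Return the MLE (alpha, beta) = (shape, scale) for positive data."""
    m = max(data)
    scaled = [x / m for x in data]  # rescale so powers stay in [0, 1]
    logs = [math.log(s) for s in scaled]
    mean_log = sum(logs) / len(logs)

    def score(alpha):
        # Profile-likelihood equation for the shape; its root is the MLE.
        powers = [s ** alpha for s in scaled]
        weighted = sum(p * l for p, l in zip(powers, logs)) / sum(powers)
        return weighted - 1.0 / alpha - mean_log

    # score() is increasing in alpha, so plain bisection is enough.
    # If all values are identical there is no root and alpha ends up near hi.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    alpha = 0.5 * (lo + hi)
    beta = m * (sum(s ** alpha for s in scaled) / len(scaled)) ** (1.0 / alpha)
    return alpha, beta


def bootstrap_ci(data, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap intervals for (alpha, beta)."""
    rng = random.Random(seed)
    fits = [weibull_mle(rng.choices(data, k=len(data))) for _ in range(n_boot)]

    def interval(values):
        values = sorted(values)
        lo_idx = int((1.0 - level) / 2.0 * (len(values) - 1))
        hi_idx = int((1.0 + level) / 2.0 * (len(values) - 1))
        return values[lo_idx], values[hi_idx]

    return interval([a for a, _ in fits]), interval([b for _, b in fits])


if __name__ == "__main__":
    # Synthetic data; note random.weibullvariate takes (scale, shape).
    rng = random.Random(42)
    sample = [rng.weibullvariate(2.0, 1.5) for _ in range(30)]
    print("MLE (shape, scale):", weibull_mle(sample))
    print("95% bootstrap intervals:", bootstrap_ci(sample, n_boot=500))
```

The resulting intervals can then be set side by side with the credible intervals from the MCMC trace. With small samples the bootstrap distribution of the shape parameter can be quite skewed, so the percentile interval should be read with some caution.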

As a case study, you can have a look at this Bayesian reanalysis I did for a study: https://osf.io/ygh2v/
In the ipynb, I compared the Bayesian posterior estimates and credible intervals with the bootstrap applied to one of the frequentist estimators.

Thanks for sharing all that!