Generalized Extreme Value analysis in pymc3

I am trying to train a generalized extreme value model with the code below. The model is described in Stuart Coles book “An Introduction to Statistical Modeling of Extreme Values” and I have used the same data port pirie sea levels. The code is:

with pm.Model() as model_gev:
    loc = pm.Normal('loc', mu=3, sigma=10)
    sig=pm.Normal('sigma', mu=0.24, sigma=10)
    xi = pm.Normal('xi', mu=0, sigma=10)

    def gev_logp(value):
        scaled = (value - loc) / sig
        logp = -(log(sig)
                 + ((xi + 1) / xi) * tt.log1p(xi * scaled)
                 + (1 + xi * scaled) ** (-1/xi))
        alpha = loc - sig/xi
        bounds = tt.switch(xi > 0, xi > alpha, xi < alpha)
        return bound(logp, bounds, xi!= 0)

    gev = pm.DensityDist('gev', gev_logp, observed=series)
    step = pm.Metropolis()
    trace = pm.sample(10000, chains=4, tune=3000,step=step,start={'loc': 1.0, 'sigma': 1.0, 'xi': 1.0})

I get the following estimates which is very close to maximum likelihood estimates:

loc=3.88, sigma=0.21 and xi=-0.07. rhat is 1.

My question is the density plots of the parameters. Do they look reasonable?

I can’t say much without spending some time to understand the model, but those traceplots look okay.

There’s plenty of resources around here to help with model evaluation, a great start is: A Tour of Model Checking techniques by Rob Zinkov

Thanks Jonathan for the review. Just needed a second opinion. The link to Robs presentation was excellent. Soren.

1 Like

Hi @sojohan : just seeing this now and hope this is of use to others:

Yes, the estimates are good. Coles uses both the delta method and the profile likelihood to look at the variability in the parameter and return level estimates. So, look at his Fig 3.2 and compare to yours for the shape parameter - it’s excellent! He also reports the 95% Confidence Intervals (delta method - assumes normality) as follows:

[3.82,3.93] for loc, [0.158,0.238] for sigma, and [-0.242,0.142] for xi

So nice job! I intend adding the GEV to PyMC3 through a PR soon.


Thanks Colin. I have actually made a couple of test on different standard datasets both for Block Maxima and Peak over thresholds. My idea was to publish these cases on TowardsDataScience.

1 Like