I can't get rid of divergences in Extreme Value Analysis

davidsuculum · January 11, 2024, 11:11am

Hi,

I’m trying to fit a GEV with the following data, taking from the annual maxima (block maxima approach):

array([ 3.698211 , 12.084102 , 15.341089 ,  9.201157 ,  7.0970197,
       13.50453  , 26.924686 , 16.444876 ,  9.672132 ,  8.024217 ,
       22.364328 , 16.575699 ,  6.1274405,  8.381071 , 16.532887 ,
       14.530084 ,  5.5311694,  8.343545 , 13.957676 ,  5.140601 ,
       23.630177 ], dtype=float32)

Very few data, I know. Another option would be to use Peak Over Threshold instead of Block Maxima, and I think that would add more data, but Peak Over Threshold needs a Generalized Pareto Distribution, and I can’t find that distribution in PyMC right now.

I did, following Generalized Extreme Value Distribution — PyMC example gallery :

import pymc as pm
import numpy as np
import pymc_experimental.distributions as pmx

# Consider return period of 5 years (p=1/5)
p = 1/5

with pm.Model() as model:
    # Priors
    μ = pm.Normal("μ", mu=2, sigma=2)
    σ = pm.HalfNormal("σ", sigma=2)
    ξ = pm.Normal("ξ", mu=1, sigma=1.5)

    # Estimation
    gev = pmx.GenExtreme("gev", mu=μ, sigma=σ, xi=ξ, observed=data)
    # Return level
    z_p = pm.Deterministic("z_p", μ - (σ / ξ) * (1 - (-np.log(1 - p)) ** (-ξ)))

# The inference

with model:
    trace = pm.sample(
        5000,
        #cores=4,
        chains=4,
        tune=1000,
        #initvals={"μ": 2, "σ": 1.0, "ξ": 1},
        target_accept=0.98,
    )
    
# add trace to existing idata object
idata.extend(trace)

I have done many tests changing the priors, the initialization of the priors and I have standardized the data, but I can’t get rid of the so many divergences. Sometimes even the NUTS stops because the initialization doesn’t work.

I could perform a frequentist analysis with “PyExtremes”, so I’d say there should be some combination of parameters allowing also a Bayesian analysis. Any help would be very much appreciated. Thx.

cluhmann · January 11, 2024, 2:08pm

Not knowing much about this type of analysis, I would begin with a prior predictive check. Doing so suggests that your priors (and model) are generating some values that are much, much larger than your observed data (10-15 orders of magnitude larger). This suggests to me that the GEV is not a great fit to your data, at least with the specific priors you have select. I fiddled a tiny bit with xi, cranking it’s prior down to zero with relatively tight sigma, and that seemed to eliminate the divergences. No clue whether this is appropriate, but it’s more evidence that your divergences may have to do with your priors.

davidsuculum · January 12, 2024, 9:04am

Thanks, that seems to solve it. I changed xi to

ξ = pm.Normal("ξ", mu=0, sigma=0.01)

and it makes sense (if xi is close to 0 we have a Gumbel type of extreme value distribution). Could you please tell what made you think that the xi could be the key parameter to change to solve the problem?

cluhmann · January 12, 2024, 1:45pm

I saw a loc, a scale, and a shape parameter. Give my experience with other 3-parameter distributions, I figured the shape parameter might be causing the insanely long right tail (i.e., I assumed that the scale parameter controlled something a bit more symmetric). So dumb luck basically.

Topic		Replies	Views
Extreme Value Analysis with Pymc3 Questions	1	555	May 11, 2021
Generalized Extreme Value analysis in pymc3 Questions	4	1854	September 16, 2021
Initial evaluation of model at starting point failed, when attempting to use truncated model version agnostic	7	50	November 4, 2024
Posterior Predictive Sampling Error : Custom Distribution Questions	0	295	September 21, 2021
Gamma-Gamma Spend Model: differences in likelihood vs distribution fit	5	45	October 23, 2024

I can't get rid of divergences in Extreme Value Analysis

Related topics