Help with GP modelling

Hi! I’m trying to use a GP to model a Wöhler curve, the relation between cycles to failure and stress in a mechanical component. The curve is obtained experimentally and for a research paper I’m trying to fit a GP to experimental data representing that curve. I’m particularly interested in the relation Stress vs. Log N, N being cycles to failure. The experimental data looks like this:


Additionally I want to recreate this tutorial since variance largely changes with the number of cycles, large number of cycles vs. low stress present very high variance in contrast to low cycle and high stress, I believe this to be a nice addition to the research.
Below is my best attempt so far:

The code for the model is as follows:

with pm.Model() as SNCurveModel:
    ℓ = pm.Gamma("ℓ", mu=ℓ_μ, sigma=ℓ_σ)
    η = pm.Gamma("η", alpha=2, beta=1)
    cov = η ** 2 * pm.gp.cov.Exponential(input_dim=1, ls=ℓ) + pm.gp.cov.WhiteNoise(sigma=1e-4)
    gp_ht = pm.gp.Latent(cov_func=cov)
    μ_f = gp_ht.prior("μ_f", X=S)

    σ_ℓ = pm.Gamma("σ_ℓ", mu=ℓ_μ, sigma=ℓ_σ)
    σ_η = pm.Gamma("σ_η", alpha=2, beta=1)
    σ_cov = σ_η ** 2 * pm.gp.cov.ExpQuad(input_dim=1, ls=σ_ℓ) + pm.gp.cov.WhiteNoise(sigma=1e-6)

    σ_gp = pm.gp.Latent(cov_func=σ_cov)
    σ_f = σ_gp.prior("lg_σ_f", X=S)
    nu = pm.Gamma("nu", alpha=2, beta=1)
    lik_ht = pm.StudentT("lik_ht",nu = nu,  mu=μ_f, sd=σ_f, observed=log_N)

    trace = pm.sample_smc(draws=2000, parallel=True)

I had some success with the SMC sampler but as showed in the previous figure the model struggles with capturing the high cycle and low stress zone. I believe this happens because of the zone after s/smax = 0.4 in which LogN are all very close to zero.
Is there any technique to model this correctly? Is there any useful transformation for the data that would simplify the modelling? Should I implement a two system model depending on the zone? If so what would be a good way?
Thanks in advance!!

Without an example data set I can try running your model on, I can’t offer any specific suggestions, but one thing that that jumps out to me is that your data is non-negative, but your likelihood pm.StudentT allows negative numbers. You might want to transform your linear predictor μ_f using pm.math.exp or log1pexp (softplus). The same is true for your scale parameter, σ_f. That one must be transformed to be positive.

I would also remove the two pm.gp.cov.WhiteNoise components in favor of using the jitter argument of pm.gp.Latent.

Also, why sample_smc instead of the default NUTS? NUTS should work fine here

Thanks for the reply!

Running main.py here should perform the calculation, it only needs pymc3 installed.
The likelihood part is interesting, and I haven’t thought about it either. That might be the reason why I had trouble with NUTS in the first place.
Will implement the changes you recommended! Thanks again