BART Model convergence problem

Hi!
Apologies if the question is obvious.

Currently I’m working on a BART model, and my goal is to forecast time series data. I split the data into a training and a test set, and the predictions on both look quite good. However, the model fails to converge (checked with pmb.plot_convergence). My question is: should I improve the current BART model (and if yes, could you recommend how), or is the current model okay to use?

# Assumes X_train, Y_train, and RANDOM_SEED are defined earlier
import pymc as pm
import pymc_bart as pmb

with pm.Model() as model:
    data_X = pm.MutableData("data_X", X_train)
    Y = Y_train
    μ = pmb.BART("μ", data_X, Y, m=50)   # sum of 50 trees
    σ = pm.HalfNormal("σ", Y.std())      # noise scale prior
    y = pm.Normal("y", μ, σ, observed=Y, shape=μ.shape)
    idata = pm.sample(random_seed=RANDOM_SEED)

Showing predicted vs. real values on the test set.
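For reference, a sketch of how predictions for the test set can be drawn from this model (X_test here is assumed to be the held-out predictor matrix, defined analogously to X_train):

with model:
    # Swap in the test predictors registered via MutableData, then draw
    # posterior predictive samples for the test set
    pm.set_data({"data_X": X_test})
    ppc = pm.sample_posterior_predictive(idata, random_seed=RANDOM_SEED)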

Checking convergence. As you can see, neither chart meets the criteria for model convergence.
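The check was done with pmb.plot_convergence; a minimal call looks roughly like this (the var_name argument is assumed here):

# Plots effective sample size and R-hat diagnostics for the BART variable
pmb.plot_convergence(idata, var_name="μ")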

Thank you in advance for your help!

Hi, usually when a model I fit doesn’t converge, I start by taking a look at how I’ve specified my priors. Out of curiosity, what is the value of Y.std()? Are your Y values between 0 and 1? I see on the graph that they are percentages, but in the data are they bounded between 0 and 1? What happens if you change your prior on sigma?
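For example, one thing to try (purely illustrative; the 0.1 scale only makes sense if Y is roughly on a 0-1 scale) would be a tighter prior on sigma:

with pm.Model() as model_alt:
    data_X = pm.MutableData("data_X", X_train)
    μ = pmb.BART("μ", data_X, Y_train, m=50)
    # Illustrative alternative: a tighter HalfNormal scale instead of Y.std()
    σ = pm.HalfNormal("σ", 0.1)
    y = pm.Normal("y", μ, σ, observed=Y_train, shape=μ.shape)
    idata_alt = pm.sample(random_seed=RANDOM_SEED)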

Hi @FanniLy

It’s not obvious at all. It is known that BART models (not just PyMC-BART) can provide good fits/predictions and still show convergence issues, so how to diagnose BART models and how to improve their convergence is still an open research question.

Adding to @Dekermanjian's suggestions, you may want to try rescaling the data to the [0, 1] interval and using a Beta distribution (with the mu, nu parametrization). For nu you can use pm.HalfCauchy("nu", 1) if you have no better idea; nu is a concentration parameter, so the higher it is, the smaller the variance. Given your example data I expect a "very high" number.
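A minimal sketch of what that could look like (the min-max rescaling, the epsilon clipping, and fitting BART on the logit scale with an invlogit link so that mu stays inside (0, 1) are illustrative choices of mine, not the only way to set this up):

import numpy as np

# Rescale the target into (0, 1); clip away exact 0/1, which a Beta likelihood cannot handle
eps = 1e-4
Y_scaled = np.clip((Y_train - Y_train.min()) / (Y_train.max() - Y_train.min()), eps, 1 - eps)

with pm.Model() as beta_model:
    data_X = pm.MutableData("data_X", X_train)
    # BART fitted on the logit scale, mapped back into (0, 1) to serve as the Beta mean
    μ_raw = pmb.BART("μ_raw", data_X, np.log(Y_scaled / (1 - Y_scaled)), m=50)
    μ = pm.Deterministic("μ", pm.math.invlogit(μ_raw))
    ν = pm.HalfCauchy("ν", 1)   # concentration: the higher nu, the smaller the variance
    y = pm.Beta("y", mu=μ, nu=ν, observed=Y_scaled, shape=μ_raw.shape)
    idata_beta = pm.sample(random_seed=RANDOM_SEED)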

Additionally, you can tweak the sampler a little bit. The PGBART sampler has two parameters: the number of particles (defaults to 10) and the batch, which defaults to (0.1, 0.1), meaning 10% of the m trees are updated at each step during tuning and the same fraction after tuning. In my experience increasing these numbers can sometimes help, but usually the benefit is small and the computational cost can be high. To change these parameters you can pass a dictionary to pm.sample, like pgbart={"num_particles": 10, "batch": (0.1, 0.1)}.
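For instance (the values here are just illustrative, larger than the defaults, and will make sampling slower):

with model:
    # More particles and a larger fraction of trees updated per step than the defaults
    idata = pm.sample(
        random_seed=RANDOM_SEED,
        pgbart={"num_particles": 20, "batch": (0.2, 0.2)},
    )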