Apparent pathological behavior

Unfortunately, the function reducedOp(pr_vec) is non-trivial. It approximates the solution to a PDE using parameter space reduction techniques. The pr_vec variable is a vector of coefficients for a set of parameter basis functions; the weighted sum of these basis functions gives the parameter field that is fed into the PDE model. The output vector yr consists of values extracted from the PDE solution at discrete points. The goal of the MCMC model is to recover the pr_vec that reproduces the measurement data y_obs.
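To make the structure concrete, here is a rough sketch of the shape reducedOp takes (basis, solve_pde and extract are stand-ins for the actual implementation, which I've left out):

    import theano.tensor as tt

    class ReducedOp(tt.Op):
        # maps basis coefficients to predicted data at the measurement points
        itypes = [tt.dvector]   # pr_vec: vector of basis-function coefficients
        otypes = [tt.dvector]   # yr: PDE solution values at the Ns*(Ne-1) measurement points

        def __init__(self, basis, solve_pde, extract):
            self.basis = basis            # (n_cells x n_coeffs) matrix of parameter basis functions
            self.solve_pde = solve_pde    # callable: parameter field -> PDE solution
            self.extract = extract        # callable: PDE solution -> values at measurement points

        def perform(self, node, inputs, outputs):
            (pr_vec,) = inputs
            field = self.basis.dot(pr_vec)      # weighted sum of the basis functions
            solution = self.solve_pde(field)    # forward PDE solve
            outputs[0][0] = self.extract(solution)  # 1-D array of length Ns*(Ne-1)

        # NUTS needs gradients, so the real Op implements grad() as well

    # reducedOp = ReducedOp(basis, solve_pde, extract)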

I managed to get a decent number of samples to run with your suggestions, and also by reducing target_accept back to the default of 0.8. I ran 1000 tuning steps and 2000 draws per core using 2 cores. Below is the model formulation:

import pymc3 as pm

# np, Ns and Ne are integers defined earlier (np is the number of basis
# coefficients, shadowing the usual numpy alias); y_obs is the measured data.
with pm.Model() as ERTModel:
    prmu = pm.Normal('prmu', mu=0., sd=1., shape=np)
    pr_vec = pm.Normal('pr_vec', mu=prmu, sd=2.5, shape=np)    # Parameter coefficients not centered around 0 (centered on prmu instead)
    tau = pm.HalfNormal('tau', sd=2.5)
    yr_tilde = pm.Normal('yr_tilde', mu=0, sd=2.5, shape=Ns*(Ne-1))
    yr = pm.Deterministic('yr', reducedOp(pr_vec) + tau * yr_tilde)    # non-centered model output
    sigma = pm.HalfNormal('sigma', sd=2.5, shape=Ns*(Ne-1))
    y_rec = pm.Normal('y_rec', mu=yr, sd=sigma, observed=y_obs)
    trace = pm.sample(2000, tune=1000, cores=2, nuts_kwargs=dict(target_accept=.8, max_treedepth=10))#, progressbar=False)

This model ran a lot faster (2.06 s/draw), but it still has very few effective samples, and the Gelman-Rubin statistic is above 1.4 for some parameters. These numbers come from the output summary of the run:
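The summary itself comes from pm.summary, along these lines (column names are n_eff/Rhat in older PyMC3 releases and ess_bulk/r_hat in newer ones):

    summary = pm.summary(trace)
    # show the worst-mixing parameters first
    print(summary.sort_values('Rhat', ascending=False).head(10))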

I found the energy_error (1 below) and mean_tree_accept (2 below) plots interesting but couldn't really make sense of them. It seems that the proportion of proposed transitions that are accepted drops dramatically once the sampler has finished the tuning process, and I'm not sure why.

[Plots 1 and 2: energy_error and mean_tree_accept per draw]

The step_size plot revealed a jump from 0.00055 to 0.00075 at draw 2000.
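These plots come from the per-draw statistics that NUTS stores on the trace, roughly like this (note that get_sampler_stats concatenates the chains by default, so with 2 chains of 2000 draws the second chain starts at draw 2000):

    import matplotlib.pyplot as plt

    # plot each NUTS statistic over the (concatenated) draws
    for stat in ['energy_error', 'mean_tree_accept', 'step_size']:
        plt.figure()
        plt.plot(trace.get_sampler_stats(stat))
        plt.title(stat)
    plt.show()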

I also found it interesting that when I ran the same model with far fewer draws and tuning steps, the number of effective samples was much better.
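For example, the shorter run was along these lines (the run lengths here are illustrative, not the exact values I used):

    with ERTModel:
        trace_short = pm.sample(200, tune=200, cores=2,
                                nuts_kwargs=dict(target_accept=.8, max_treedepth=10))
    print(pm.summary(trace_short))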

Do you think my problem is still poor parametrisation, or is there potentially some other issue at play?