Hm, if they are discarded that would make sense, thank you! So when there is a diverging trajectory, the actual diverging sample is discarded and a new non-diverging one is drawn from earlier in the trajectory? But it’s still marked as diverging?
If so would it be correct to conclude that:
- The actual discarded divergent samples are available in the sampling reports?
- The divergent samples in the sampling reports should be negative?
I don’t really understand the structure of the sampling report (is it documented anywhere?) but here is what I attempted:
idata_warns = idata.sample_stats.where(idata.sample_stats["warning"] != None, drop=True)
warnings = idata_warns["warning"].sel(warning_dim_0=0).stack(sample=["chain", "draw"]).dropna(dim="sample")
divergence_point_source = [w.item().divergence_point_source for w in warnings]
divergence_point_dest = [w.item().divergence_point_dest for w in warnings]
This should get the details of the raw RVs as seen by the sampler. I then transform these values to obtain the group_means (which are the RVs that would actually be clipped):
div_group_means_source = [d['pop_mean'] + np.exp(d['pop_sigma_log__']) * d['group_z'] for d in divergence_point_source]
div_group_means_dest = [d['pop_mean'] + np.exp(d['pop_sigma_log__']) * d['group_z'] for d in divergence_point_dest]
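To make the transformation above concrete, here is a minimal sketch on a single made-up point. The keys `pop_mean`, `pop_sigma_log__` and `group_z` come from my model; the helper name `group_means_from_raw` and the numeric values are purely illustrative assumptions, not something taken from the sampling report.

```python
import numpy as np

def group_means_from_raw(point):
    """Map one raw sampler point (a dict) to the non-centered group means."""
    # The sampler works with log(pop_sigma), so undo the log transform first.
    pop_sigma = np.exp(point["pop_sigma_log__"])
    # Non-centered parameterization: mean + sigma * standardized offset.
    return point["pop_mean"] + pop_sigma * np.asarray(point["group_z"])

# Hypothetical point, not real sampler output.
example_point = {
    "pop_mean": 1.0,
    "pop_sigma_log__": 0.0,  # pop_sigma = exp(0) = 1
    "group_z": np.array([-2.0, 0.5, 1.0]),
}
example_group_means = group_means_from_raw(example_point)
print(example_group_means)  # group means: -1.0, 1.5, 2.0
```

With these made-up numbers the first group mean is negative, so that is the entry a clip at zero would alter.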
Now if I understood it correctly, my expectation is that all entries in div_group_means_source should be positive and all entries in div_group_means_dest should be negative:
(np.array(div_group_means_source) < 0).any()
True
So far so good…
(np.array(div_group_means_dest) < 0).any(axis=1).sum()/len(div_group_means_dest)
0.8217054263565892
Ok, 82% of the divergences contain at least one negative group_means value that would have needed to be clipped. So that’s mostly consistent with my understanding above. But 82% is not 100%. Why would there be a divergence if there are no negative group_means in that draw? Isn’t clip identical to the identity function in that case?
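To spell out what I mean by "identical to the identity function", here is a small sketch using NumPy's `np.clip` as a stand-in for the clip in my model. It assumes a lower bound of 0 and no upper bound; the helper `clip_is_identity` and the example arrays are made up for illustration.

```python
import numpy as np

def clip_is_identity(group_means, lower=0.0):
    """Return True if clipping at `lower` leaves every entry unchanged."""
    # Assumption: only a lower bound is applied (no upper bound).
    clipped = np.clip(group_means, lower, None)
    return bool(np.array_equal(clipped, group_means))

# Hypothetical draws, not real sampler output.
all_positive = np.array([0.3, 1.2, 2.5])
has_negative = np.array([-0.4, 1.2, 2.5])

print(clip_is_identity(all_positive))  # True: nothing to clip
print(clip_is_identity(has_negative))  # False: -0.4 is raised to 0.0
```

This only checks the value of the clip at one point. It says nothing about what the sampler sees elsewhere along the trajectory, which is part of what I am trying to understand.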