Interpretation of the standard errors is still a bit of an open question; see, for example, [2008.10296] Uncertainty in Bayesian Leave-One-Out Cross-Validation Based Model Comparison.
The main reference for that is probably the paper linked in the docstring (which we should fix and format as a proper reference): [1704.02030] Using stacking to average Bayesian predictive distributions.
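For context on the standard errors mentioned above, here is a rough sketch of the usual normal-approximation estimate for the standard error of an elpd difference between two models, computed from pointwise elpd values. The function name `elpd_diff_and_se` and the choice of `ddof=1` (sample variance) are my assumptions for illustration; a given library may use a different variance convention, so treat this as a sketch rather than what any particular implementation does.

```python
import numpy as np


def elpd_diff_and_se(pointwise_a, pointwise_b):
    """Return (elpd_diff, se_diff) from pointwise elpd values of two models.

    Hypothetical helper for illustration only. Both inputs must hold one
    elpd value per observation, in the same observation order.
    """
    a = np.asarray(pointwise_a, dtype=float)
    b = np.asarray(pointwise_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("pointwise arrays must have the same shape")

    diff = a - b          # paired pointwise differences
    n = diff.size
    elpd_diff = diff.sum()
    # Normal approximation: SE of a sum of n paired terms.
    # ddof=1 (sample variance) is an assumption; some code uses ddof=0.
    se_diff = np.sqrt(n * diff.var(ddof=1))
    return elpd_diff, se_diff


# Usage with made-up pointwise values (not real model output).
rng = np.random.default_rng(0)
fake_a = rng.normal(-1.0, 0.5, size=50)
fake_b = rng.normal(-1.1, 0.5, size=50)
print(elpd_diff_and_se(fake_a, fake_b))
```

The paired difference matters here: the standard error is computed on the per-observation differences, not by combining each model's own standard error, and it is this quantity whose interpretation the first paper discusses.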