Why does `transform=pm.distributions.transforms.ordered` lead to worse convergence?

Just a wild guess, but perhaps the `ordered` transform is failing because of the 2D shape. Does the same thing happen if you define the means as a 1D variable and reshape afterwards, something like:

_means = pm.Normal(
    "_means",
    mu=[0, 0.1],
    sd=10,
    shape=groups,
    # Will be toggling this line
    # transform=pm.distributions.transforms.ordered,
    testval=np.array([0, 0.2]),
)
means = pm.Deterministic("means", tt.reshape(_means, (1, groups)) + trend)
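
In case it helps with debugging, here is a rough NumPy sketch of what an ordered transform does along a 1D vector: the sampler works on an unconstrained vector, and the ordered values are rebuilt as a first element plus a cumulative sum of positive gaps. This is only an illustration, not PyMC3's actual code, and the function names `to_unconstrained` and `to_ordered` are made up for the example.

```python
import numpy as np


def to_unconstrained(x):
    """Map a strictly increasing vector to the unconstrained space.

    Hypothetical helper: keeps the first element and takes the log of
    each gap between neighbouring elements.
    """
    x = np.asarray(x, dtype=float)
    return np.concatenate([x[:1], np.log(np.diff(x))])


def to_ordered(y):
    """Map an unconstrained vector back to a strictly increasing one.

    Hypothetical helper: exponentiates all but the first element to get
    positive gaps, then accumulates them.
    """
    y = np.asarray(y, dtype=float)
    increments = np.concatenate([y[:1], np.exp(y[1:])])
    return np.cumsum(increments)


# Same starting value as in the snippet above.
testval = np.array([0.0, 0.2])
unconstrained = to_unconstrained(testval)
roundtrip = to_ordered(unconstrained)

# Any unconstrained vector maps to increasing values.
arbitrary = to_ordered([-3.0, 5.0, -2.0])
```

One thing this makes visible: the starting value has to be strictly increasing along the transformed axis, otherwise the log of a zero or negative gap is not finite in the unconstrained space.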

Another guess is that one of your `_means` is redundant with `alpha` (both seem to act as intercepts), and that adding the ordering transform somehow makes this redundancy more salient. In that case, removing `alpha` should help.
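
To show what I mean by redundancy, here is a small NumPy sketch. I don't know your full model, so this assumes `alpha` is a scalar intercept added to every observation and that there is one mean per group, and it ignores the trend term; the group sizes and coefficient values are made up.

```python
import numpy as np

groups = 2
n_per_group = 3  # made-up number of observations per group

# Group membership for each observation: [0, 0, 0, 1, 1, 1]
group_idx = np.repeat(np.arange(groups), n_per_group)

# One indicator column per group (these play the role of the group means).
group_dummies = np.eye(groups)[group_idx]

# A column of ones plays the role of the shared intercept alpha.
intercept = np.ones((groups * n_per_group, 1))
X_with_alpha = np.hstack([intercept, group_dummies])

# Three columns but only two independent directions.
rank_with_alpha = np.linalg.matrix_rank(X_with_alpha)
rank_without_alpha = np.linalg.matrix_rank(group_dummies)

# Two different (alpha, mean_0, mean_1) settings with identical predictions:
# moving a constant from alpha into both means changes nothing.
coef_a = np.array([1.0, 0.0, 0.5])
coef_b = np.array([0.0, 1.0, 1.5])
same_fit = np.allclose(X_with_alpha @ coef_a, X_with_alpha @ coef_b)
```

Under these assumptions the likelihood alone cannot separate `alpha` from the group means; only the priors do, which can leave the sampler with a long, flat ridge to explore.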
