Bad initial energy and setting test values for ordered transform

Hi all,

I’m trying to run an ordinal logistic regression, but am running into issues when trying to set test values. The model I’m using is below,

import numpy as np
import pymc3 as pm
import theano.tensor as tt

with pm.Model() as model:

    # weights
    w_mu = pm.Laplace("w_mu", mu=0, b=1, shape=n_items)
    w_sigma = pm.HalfCauchy("w_sigma", beta=5, shape=n_items)

    w_raw = pm.Normal("w_raw", mu=0, sd=1, shape=(n_groups, n_items))
    w = pm.Deterministic("w", w_mu + w_raw * w_sigma)

    # thresholds
    mu_thresholds = pm.Normal("mu_thresholds", mu=0, sd=5, shape=4)
    sigma_thresholds = pm.HalfCauchy("sigma_thresholds", beta=5, shape=4)      

    thresholds = pm.Normal(
        "thresholds",
        mu=mu_thresholds,
        sd=sigma_thresholds,
        shape=4,
        testval=np.array([-2, -1, 1, 2]),
        transform=pm.distributions.transforms.ordered
    )

    wx = tt.sum(w[group_codes] * X_, axis=1)
    y_ = pm.OrderedLogistic(
        "y", 
        cutpoints=thresholds,
        eta=wx,
        observed=y
    )

I keep seeing an error related to bad initial energy, so after looking at this previous post I checked the logp values; none of them are NaN or inf, but the test values being used for the thresholds variable aren’t what I specified,

{'mu_thresholds': array([0., 0., 0., 0.]),
 'sigma_thresholds_log__': array([1.60943791, 1.60943791, 1.60943791, 1.60943791]),
 'thresholds_ordered__': array([-2.        ,  0.        ,  0.69314718,  0.        ]),
 'w_mu': array([0., 0., 0., 0., 0., 0.]),
 'w_raw': array([[0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.],
        ...,
        [0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.]]),
 'w_sigma_log__': array([1.60943791, 1.60943791, 1.60943791, 1.60943791, 1.60943791,
        1.60943791])}

The test values aren’t ordered, which makes me think (according to the post I linked to) that this could be the issue. How can I specify the test values properly for the model?
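For reference, my understanding is that the ordered transform stores the variable on an unconstrained scale: the first element is kept and the rest become log-differences. A quick numpy sketch of that forward map (my own reimplementation for illustration, not the PyMC3 internals):

```python
import numpy as np

def ordered_forward(x):
    """Map a strictly increasing vector to unconstrained space:
    keep the first element, take log-differences of the rest
    (my reading of pm.distributions.transforms.ordered)."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([x[:1], np.log(np.diff(x))])

print(ordered_forward([-2, -1, 1, 2]))  # -> [-2, 0, log(2) ~ 0.693, 0]
```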

Thanks!

Hmmm, this is a bit surprising, as your model specification looks correct to me. Would you be able to share the data as well?

Thanks for your help! I can’t share the data, but this mock data recreates the issue,

    np.random.seed(100)
    X = np.random.randint(-2, 3, size=600).reshape(-1, 6)
    y = np.random.randint(0, 5, size=100)
    group_codes = np.random.randint(0, 10, size=100)
    n_groups = len(np.unique(group_codes))
    n_items = X.shape[1] 

I should also mention that this is running on Python 2.7 with PyMC3 3.6.

For testing, I’m sampling using,

        trace = pm.sample(
            draws=1000,
            chains=2,
            cores=1,
            nuts_kwargs=dict(target_accept=0.95),
            tune=1000
        )

Oh right, the problem is that the default initialization ‘jitter+adapt_diag’ randomizes the initial values by adding some uniform jitter, which makes the ordered variable un-ordered. Something like this should work:

with model:
    trace = pm.sample(init='adapt_diag')
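To see why the jitter breaks the ordering, here is a small numpy sketch, assuming the jitter is uniform(-1, 1) added elementwise to the start point (my understanding of the default), counting how often a jittered version of the ordered test values ends up un-ordered:

```python
import numpy as np

rng = np.random.RandomState(42)
start = np.array([-2.0, -1.0, 1.0, 2.0])  # the ordered test values

# Add uniform(-1, 1) jitter (an assumption about what
# 'jitter+adapt_diag' adds) and check whether ordering survives.
n_unordered = sum(
    np.any(np.diff(start + rng.uniform(-1, 1, size=start.size)) <= 0)
    for _ in range(1000)
)
print(n_unordered)  # a sizeable fraction of jittered draws break the ordering
```

Using init='adapt_diag' skips the jitter entirely, so the test values you specify are used as-is.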

Amazing, that solved it. I feel very silly now. Thanks for your help!