ValueError: probabilities are not non-negative when trying to sample prior predictive

I’m trying to sample the prior predictive for a model I’ve built. The model looks like this:

import pymc3 as pm

# X is the (N, D) predictor matrix (D = 15); `actual` holds the observed scores
with pm.Model() as model:

    alpha = pm.Normal('alpha', mu=0, sd=10)
    beta = pm.Normal('beta', mu=0, sd=10, shape=D)  # D = 15

    latent = pm.Deterministic(
        'latent',
        alpha + pm.math.dot(X, beta)
    )

    cutpoints = pm.Normal(
        'cutpoints',
        mu=[.5, 1.5, 2.5, 3.5],
        sd=1,
        shape=4,
        transform=pm.distributions.transforms.ordered
    )

    likelihood = pm.OrderedLogistic(
        'likelihood',
        eta=latent,
        cutpoints=cutpoints,
        observed=actual
    )

actual is a vector of Likert-type scores taking values (0, 1, 2, 3, 4). The goal is to estimate the latent score underlying each discretized score and to understand what contributes to that latent score. I can get a trace for this model (though sampling is quite slow), and a posterior predictive check with sample_posterior_predictive works fine. But if I try:

with model:
    prior_check = pm.sample_prior_predictive()

then the following exception is thrown:

ValueError: probabilities are not non-negative
ValueError                                Traceback (most recent call last)
/home/cdsw/.local/lib/python3.6/site-packages/pymc3/distributions/distribution.py in _draw_value(param, point, givens, size)
    800                 try:
--> 801                     return dist_tmp.random(point=point, size=size)
    802                 except (ValueError, TypeError):

/home/cdsw/.local/lib/python3.6/site-packages/pymc3/distributions/discrete.py in random(self, point, size)
    974                                 dist_shape=self.shape,
--> 975                                 size=size)
    976 

/home/cdsw/.local/lib/python3.6/site-packages/pymc3/distributions/distribution.py in generate_samples(generator, *args, **kwargs)
    959     if dist_bcast_shape[:len(size_tup)] == size_tup:
--> 960         samples = generator(size=dist_bcast_shape, *args, **kwargs)
    961     else:

/home/cdsw/.local/lib/python3.6/site-packages/pymc3/distributions/dist_math.py in random_choice(*args, **kwargs)
    337         p = np.reshape(p, (-1, p.shape[-1]))
--> 338         samples = np.array([np.random.choice(k, p=p_) for p_ in p])
    339         # We reshape to the desired output shape

/home/cdsw/.local/lib/python3.6/site-packages/pymc3/distributions/dist_math.py in <listcomp>(.0)
    337         p = np.reshape(p, (-1, p.shape[-1]))
--> 338         samples = np.array([np.random.choice(k, p=p_) for p_ in p])
    339         # We reshape to the desired output shape

mtrand.pyx in numpy.random.mtrand.RandomState.choice()

ValueError: probabilities are not non-negative

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
in engine
      1 with model:
----> 2     t = pm.sample_prior_predictive()

/home/cdsw/.local/lib/python3.6/site-packages/pymc3/sampling.py in sample_prior_predictive(samples, model, vars, var_names, random_seed)
   1493     names = get_default_varnames(vars_, include_transformed=False)
   1494     # draw_values fails with auto-transformed variables. transform them later!
-> 1495     values = draw_values([model[name] for name in names], size=samples)
   1496 
   1497     data = {k: v for k, v in zip(names, values)}

/home/cdsw/.local/lib/python3.6/site-packages/pymc3/distributions/distribution.py in draw_values(params, point, size)
    618                                         point=point,
    619                                         givens=temp_givens,
--> 620                                         size=size)
    621                     givens[next_.name] = (next_, value)
    622                     drawn[(next_, size)] = value

/home/cdsw/.local/lib/python3.6/site-packages/pymc3/distributions/distribution.py in _draw_value(param, point, givens, size)
    808                     with _DrawValuesContextBlocker():
    809                         val = np.atleast_1d(dist_tmp.random(point=point,
--> 810                                                             size=None))
    811                     # Sometimes point may change the size of val but not the
    812                     # distribution's shape

/home/cdsw/.local/lib/python3.6/site-packages/pymc3/distributions/discrete.py in random(self, point, size)
    973                                 broadcast_shape=p.shape[:-1] or (1,),
    974                                 dist_shape=self.shape,
--> 975                                 size=size)
    976 
    977     def logp(self, value):

/home/cdsw/.local/lib/python3.6/site-packages/pymc3/distributions/distribution.py in generate_samples(generator, *args, **kwargs)
    958         )
    959     if dist_bcast_shape[:len(size_tup)] == size_tup:
--> 960         samples = generator(size=dist_bcast_shape, *args, **kwargs)
    961     else:
    962         samples = generator(size=size_tup + dist_bcast_shape, *args, **kwargs)

/home/cdsw/.local/lib/python3.6/site-packages/pymc3/distributions/dist_math.py in random_choice(*args, **kwargs)
    336         # iterate calls using the last axis as the category probabilities
    337         p = np.reshape(p, (-1, p.shape[-1]))
--> 338         samples = np.array([np.random.choice(k, p=p_) for p_ in p])
    339         # We reshape to the desired output shape
    340         samples = np.reshape(samples, out_shape)

/home/cdsw/.local/lib/python3.6/site-packages/pymc3/distributions/dist_math.py in <listcomp>(.0)
    336         # iterate calls using the last axis as the category probabilities
    337         p = np.reshape(p, (-1, p.shape[-1]))
--> 338         samples = np.array([np.random.choice(k, p=p_) for p_ in p])
    339         # We reshape to the desired output shape
    340         samples = np.reshape(samples, out_shape)

mtrand.pyx in numpy.random.mtrand.RandomState.choice()

ValueError: probabilities are not non-negative

I’m having a hard time understanding why this happens. Any help? Is my model mis-specified? Am I using sample_prior_predictive() incorrectly?


I think there is an issue with the random generator of OrderedLogistic (a subclass of Categorical).
It seems that some of the probabilities it receives (p in the code below) for choosing the outputs ("actual") are negative. My guess is that the ordered constraint is not applied when sampling from the prior distribution, which is what causes the negative probabilities.

samples = np.array([np.random.choice(k, p=p_) for p_ in p])
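
To see this concretely, here is a toy example (mine, not pymc3 code) of how unordered cut points produce negative category probabilities when you take differences of the cumulative logistic:

import numpy as np
from scipy.special import expit  # logistic sigmoid

# A hypothetical prior draw in which the cut points came out unordered
cuts = np.array([1.5, 0.5, 3.5, 2.5])
cdf = expit(cuts - 0.0)                      # P(y <= k) for eta = 0
p = np.diff(cdf, prepend=0.0, append=1.0)    # implied category probabilities
print(p)                                     # some entries are negative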

I’m having the same issue (with code from here) when trying to get a prior predictive check.

One “fix” would be to exclude the observed variable from the variable names passed to the prior predictive sampler:

prior_check = pm.sample_prior_predictive(var_names=['alpha', 'beta', 'latent', 'cutpoints'])

Then, once you have sampled 'cutpoints' and 'latent' from their priors, you can generate a prior predictive distribution outside of your model (I tried that, and noticed that the probabilities are indeed negative because the sampled cutpoints are not ordered).
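
Here is a rough sketch of that outside-the-model step, assuming prior_check comes from the call above; the array shapes and the np.sort are my assumptions, and sorting is exactly what makes the probabilities valid:

import numpy as np
from scipy.special import expit  # logistic sigmoid

# Assumed shapes: prior_check['cutpoints'] is (n_draws, 4),
# prior_check['latent'] is (n_draws, N).
cuts = np.sort(prior_check['cutpoints'], axis=-1)  # sort: the ordered transform was not applied
eta = prior_check['latent']

# Cumulative probabilities P(y <= k) = logistic(cutpoint_k - eta); their
# differences (padded with 0 and 1) give the five category probabilities.
cdf = expit(cuts[:, None, :] - eta[:, :, None])           # (n_draws, N, 4)
probs = np.diff(cdf, axis=-1, prepend=0.0, append=1.0)    # (n_draws, N, 5)

# One draw per observation from the implied categorical distribution.
rng = np.random.default_rng()
flat = probs.reshape(-1, probs.shape[-1])
prior_pred = np.array([rng.choice(5, p=p) for p in flat]).reshape(probs.shape[:-1])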

Not sure how to fix the original problem or whom I should inform. Any help would be appreciated.


Interesting, and thanks for the deep dive @nrakocz!
Pinging @lucianopaz, as I seem to remember he has already worked with OrderedLogistic and is well versed in forward sampling functions in general – he’ll probably have good advice if you wanna do a PR and fix this :wink:

@JaredStufft, your model is fine and, @nrakocz, there isn’t a problem with the random number generator from the ordered logistic distribution. What’s happening is that the ordered logistic does not sort the cut points. This is explicitly stated in the distribution’s docstring.

What’s really causing this problem is that transformations (in your case the ordered transform) and also potentials are not applied during forward sampling. This is a known issue in pymc3 (I’ll try to add some links to issues on GitHub and also discourse threads later). At the moment, the only thing that you can do is to pass theano.tensor.sort(cutpoints) to the ordered logistic distribution. This shouldn’t have a big performance impact during inference and posterior predictive sampling, but should prevent the problems you are seeing with your prior predictive.
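
For concreteness, here is roughly what that change looks like in the model from the question (only the cutpoints argument of the likelihood changes; D, X and actual are the original data):

import pymc3 as pm
import theano.tensor as tt

with pm.Model() as model:
    alpha = pm.Normal('alpha', mu=0, sd=10)
    beta = pm.Normal('beta', mu=0, sd=10, shape=D)

    latent = pm.Deterministic('latent', alpha + pm.math.dot(X, beta))

    cutpoints = pm.Normal(
        'cutpoints',
        mu=[.5, 1.5, 2.5, 3.5],
        sd=1,
        shape=4,
        transform=pm.distributions.transforms.ordered
    )

    likelihood = pm.OrderedLogistic(
        'likelihood',
        eta=latent,
        # sort so that forward sampling also sees ordered cut points
        cutpoints=tt.sort(cutpoints),
        observed=actual
    )

with model:
    prior_check = pm.sample_prior_predictive()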


Thanks, @AlexAndorra & @lucianopaz!

You’re never disappointed when Luciano explains something :wink:
