Problem with pm.Categorical

I’m getting strange behavior from a very simple model with a categorical likelihood. I suspect the issue is my lack of experience with this particular kind of model.

Here is a minimal example that illustrates the problem:

import numpy as np
import pymc3 as pm
import theano.tensor as tt
import matplotlib.pyplot as plt

with pm.Model():
    # simulate 1000 draws from a known categorical distribution
    y = np.random.choice(4, 1000, p=[0.1, 0.2, 0.6, 0.1])
    print(y)

    # one independent Beta(1, 1) prior per category
    probs = []
    for k in range(4):
        _p = pm.Beta(name='p%i' % k, alpha=1, beta=1)
        probs.append(_p)

    p = tt.stack(probs)
    pm.Categorical(name='y', p=p, observed=y)

    trace = pm.sample(draws=10000, tune=2000)
    pm.traceplot(trace)
    plt.savefig('traceplot.png')

I think this model should converge on the correct values of p, but it doesn’t.

Any ideas what’s going wrong here?

pm.Categorical normalizes the input vector p so that p.sum(-1) == 1.

This is more transparent if you do:

import numpy as np
import pymc3 as pm
import theano.tensor as tt

preal = [0.1, 0.2, 0.6, 0.1]
y = np.random.choice(4, 1000, p=preal)
print(y)

with pm.Model():
    probs = []
    for k in range(4):
        _p = pm.Beta(name='p%i' % k, alpha=1, beta=1)
        probs.append(_p)

    p = tt.stack(probs)
    # normalize explicitly so the stacked Betas form a proper probability vector
    p1 = pm.Deterministic('p', p / p.sum())
    pm.Categorical(name='y', p=p1, observed=y)

    trace = pm.sample(draws=10000, tune=2000)

pm.traceplot(trace, varnames=['p'], lines=dict(p=preal))
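
You can also see the normalization directly (a minimal sketch using the PyMC3 Categorical.dist API; exact behavior may vary by version): rescaling p leaves the log-probability unchanged.

import numpy as np
import pymc3 as pm

p_raw = np.array([0.1, 0.2, 0.6, 0.1])
# Categorical normalizes p internally, so a rescaled vector
# defines the same distribution
d1 = pm.Categorical.dist(p=p_raw)
d2 = pm.Categorical.dist(p=10 * p_raw)
print(d1.logp(2).eval(), d2.logp(2).eval())  # identical values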

Thanks! Looks like I was being completely dumb!

lol. BTW you might want to use a Dirichlet prior, since it generates a probability vector p that sums to 1 by construction.

For future reference, here is the code for junpenglao’s Dirichlet prior suggestion.

import numpy as np
import pymc3 as pm

preal = [0.1, 0.2, 0.6, 0.1]
y = np.random.choice(4, 1000, p=preal)

with pm.Model():
    # flat Dirichlet prior; p sums to 1 by construction
    p = pm.Dirichlet('p', a=np.ones(4))

    pm.Categorical(name='y', p=p, observed=y)

    trace = pm.sample(draws=1000, tune=200)

pm.traceplot(trace, varnames=['p'], lines=dict(p=preal))
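
As a quick sanity check (a sketch; the indexing assumes the PyMC3 MultiTrace API), the posterior means of p should land close to preal:

# posterior means should be roughly [0.1, 0.2, 0.6, 0.1]
print(trace['p'].mean(axis=0))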