Trouble specifying X | a, b, c, d ~ Categorical( . )

Hello PyMC community,

As the title suggests, I am having trouble implementing a model involving missing data that is categorically distributed.

A brief description of the data:

The dataset has columns X, Y, a, b, c, d (simplified). In some rows the pair (a, b) is missing (both values together), and likewise for the pair (c, d).

Y, a, b, c, and d each take values in {0, 1}, while X takes values in {0, 1, 2}.

A brief description of the model:

4 * P(X = x) = 1(a+c == x) + 1(a+d == x) + 1(b+c == x) + 1(b+d == x), where 1( . ) denotes the indicator function.
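To make the formula concrete, here is a minimal NumPy sketch of how the probability matrix could be built for rows where a, b, c, and d are all observed. The function name `categorical_probs` and the toy inputs are made up for illustration, and rows with missing values are not handled here.

```python
import numpy as np

def categorical_probs(a, b, c, d):
    """Row-wise P(X = x) for x in {0, 1, 2}, from fully observed 0/1 columns."""
    a, b, c, d = (np.asarray(v, dtype=int) for v in (a, b, c, d))
    # The four sums named in the formula, one column per pair: shape (n, 4).
    sums = np.stack([a + c, a + d, b + c, b + d], axis=1)
    # Indicator 1(sum == x) for x = 0, 1, 2, averaged over the four pairs: shape (n, 3).
    return (sums[:, :, None] == np.arange(3)).mean(axis=1)

# Toy data (hypothetical): three rows with every column observed.
cat_p = categorical_probs(a=[0, 1, 1], b=[0, 0, 1], c=[0, 1, 1], d=[0, 0, 1])
print(cat_p)
```

Each row of the result sums to 1 by construction, since every one of the four sums equals exactly one of 0, 1, or 2.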

We have a latent parameter P = (P_0, P_1, P_2), where Y | X = x, P ~ Bernoulli(P_x).

The goal is to sample from the posterior of P.
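As a point of reference, if X and Y are both fully observed and each P_k is given an independent Beta(1, 1) prior (an assumption at this point in the description), the posterior of P is available in closed form by Beta-Bernoulli conjugacy: P_k | data ~ Beta(1 + number of rows with X = k and Y = 1, 1 + number of rows with X = k and Y = 0). A rough sketch, with a hypothetical helper name and toy data:

```python
import numpy as np

def beta_posterior_params(x, y, alpha=1.0, beta=1.0):
    """Posterior Beta parameters for P_0, P_1, P_2 given observed x and y."""
    x = np.asarray(x, dtype=int)
    y = np.asarray(y, dtype=int)
    # Count rows and successes (y == 1) within each category of x.
    trials = np.bincount(x, minlength=3)
    successes = np.bincount(x, weights=y, minlength=3)
    return alpha + successes, beta + (trials - successes)

# Toy data (hypothetical): six fully observed rows.
post_alpha, post_beta = beta_posterior_params(
    x=[0, 0, 1, 2, 2, 2],
    y=[1, 0, 1, 1, 1, 0],
)

# Independent posterior draws for (P_0, P_1, P_2).
rng = np.random.default_rng(0)
draws = rng.beta(post_alpha, post_beta, size=(1000, 3))
```

This can serve as a sanity check for whatever an MCMC run returns on the fully observed rows; it does not address the rows where (a, b) or (c, d) are missing.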

My attempt:

import pymc3 as pm

cat_p = ...
# cat_p is a 20 by 3 matrix of probabilities, built from the indicator functions described above.
with pm.Model() as model:

    # Flat Beta(1, 1) prior on each of P_0, P_1, P_2.
    p = pm.Beta('p', alpha=1, beta=1, shape=3)
    y = pm.Bernoulli('y', p=p, shape=3, observed=col_y)
    x = pm.Categorical('x', p=cat_p, shape=20, observed=col_x)

An error occurs at x = Categorical(…). Any idea where I should begin investigating the issue?
The error persists with only the x = Categorical line.


    g_members = pm.Categorical("g_members", p=cat_param, shape=members.size, observed=members)
  File "/home/fool/.virtualenvs/pymc-models/lib/python3.6/site-packages/pymc3/distributions/", line 41, in __new__
    dist = cls.dist(*args, **kwargs)
  File "/home/fool/.virtualenvs/pymc-models/lib/python3.6/site-packages/pymc3/distributions/", line 52, in dist
    dist.__init__(*args, **kwargs)
  File "/home/fool/.virtualenvs/pymc-models/lib/python3.6/site-packages/pymc3/distributions/", line 706, in __init__
    self.k = tt.shape(p)[-1].tag.test_value
  File "/home/fool/.virtualenvs/pymc-models/lib/python3.6/site-packages/theano/gof/", line 615, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/home/fool/.virtualenvs/pymc-models/lib/python3.6/site-packages/theano/compile/", line 252, in make_node
    x = theano.tensor.as_tensor_variable(x)
  File "/home/fool/.virtualenvs/pymc-models/lib/python3.6/site-packages/theano/tensor/", line 194, in as_tensor_variable
    return constant(x, name=name, ndim=ndim)
  File "/home/fool/.virtualenvs/pymc-models/lib/python3.6/site-packages/theano/tensor/", line 232, in constant
    x_ = scal.convert(x, dtype=dtype)
  File "/home/fool/.virtualenvs/pymc-models/lib/python3.6/site-packages/theano/scalar/", line 284, in convert
    assert type(x_) in [np.ndarray, np.memmap]

Sorry, what is the error?

Ahh, how did I leave that out… Thank you, @aakhmetz. I just updated the original post.

[Bump] Could anyone lend some insight? Or, if my problem description is unclear, what should I clarify?

The Categorical random variable is not doing anything in this case, as it has no free parameters.
In any case, it shouldn't throw an error, so check whether cat_p contains any negative or NaN values. Also, updating your pymc3 to master might help, as @lucianopaz just pushed a PR that fixes many issues in Categorical.
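One way to run the suggested check, as a small NumPy sketch. The helper name `check_probability_rows` and the example matrix are hypothetical; the row-sum check is an extra beyond the negative/NaN test mentioned above.

```python
import numpy as np

def check_probability_rows(p, atol=1e-8):
    """Report NaNs, negative entries, and rows that do not sum to 1."""
    p = np.asarray(p, dtype=float)
    row_sums = p.sum(axis=1)
    return {
        "has_nan": bool(np.isnan(p).any()),
        "has_negative": bool((p < 0).any()),
        # Indices of rows whose sum is not (numerically) 1; NaN rows land here too.
        "bad_rows": np.flatnonzero(~np.isclose(row_sums, 1.0, atol=atol)).tolist(),
    }

# Hypothetical matrix with one NaN row and one row containing a negative entry.
report = check_probability_rows([
    [0.25, 0.5, 0.25],
    [np.nan, 0.5, 0.5],
    [-0.5, 1.0, 0.5],
])
print(report)
```

Note that the last row sums to 1 and so is only caught by the negative-entry check, which is why both tests are worth running.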

What is the exact type of cat_p? If it is a NumPy array, what is its dtype? I’m asking because it must be either a numerical array or a Theano tensor, not an object array.
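A quick way to answer this, and a conversion that may be worth trying, sketched with NumPy only. The array `cat_p_raw` below is a made-up stand-in for the real matrix. The last line of the traceback asserts that the converted value's type is exactly np.ndarray or np.memmap, so forcing a plain float ndarray before passing it to Categorical is a reasonable first experiment, though whether it resolves this particular error is not confirmed.

```python
import numpy as np

# Hypothetical stand-in: a matrix that ended up with dtype=object.
cat_p_raw = np.array([[0.25, 0.5, 0.25], [1, 0, 0]], dtype=object)

# Inspect the container type and the element dtype.
print(type(cat_p_raw), cat_p_raw.dtype)

# np.asarray with an explicit dtype returns a base-class ndarray of floats;
# it raises if some entry cannot be converted to a number.
cat_p = np.asarray(cat_p_raw, dtype=float)
print(type(cat_p), cat_p.dtype, cat_p.shape)
```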