Problem with categorical index variable in v5

I am trying to build a vectorized model. The essence of the model is below; the full model comes from Statistical Rethinking (see pymc-resources/Rethinking_2/Chp_08.ipynb at 93fa3ae6f485e86b9c6cfdf15554cb3abd281122 · pymc-devs/pymc-resources · GitHub, Code 8.13, model m_8_3).

import numpy as np
import pandas as pd
import pymc as pm

cid = pd.Categorical(np.array([1,0,0,0,0,1]))
vec = np.array([0.2,0.3,0.4,0.5,0.6,0.7])
with pm.Model():
    a = pm.Normal("a", 1, 0.1, shape = cid.categories.size)
    b = pm.Normal("b", 0, 0.3, shape = cid.categories.size)
    mu = a[cid] + b[cid] * vec

This fails with pymc 5.10.2, although the same indexing used to work with aesara (as shown, for example, in the notebook linked above). This is the error message I get:

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Cell In[10], line 11
      9 a = pm.Normal("a", 1, 0.1, shape = cid.categories.size)
     10 b = pm.Normal("b", 0, 0.3, shape = cid.categories.size)
---> 11 mu = a[cid] + b[cid] * vec

File ~/miniconda3/envs/prpro-2024/lib/python3.11/site-packages/pytensor/tensor/variable.py:520, in _tensor_py_operators.__getitem__(self, args)
    512     return (isinstance(val, (tuple, list)) and len(val) == 0) or (
    513         isinstance(val, np.ndarray) and val.size == 0
    514     )
    516 # Force input to be an int datatype if input is an empty list or tuple
    517 # Else leave it as is if it is a real number
    518 # Convert python literals to pytensor constants
    519 args = tuple(
--> 520     [
    521         pt.subtensor.as_index_constant(
    522             np.array(inp, dtype=np.uint8) if is_empty_array(inp) else inp
    523         )
    524         for inp in args
    525     ]
    526 )
    528 # Determine if advanced indexing is needed or not.  The logic is
    529 # already in `index_vars_to_types`: if it succeeds, standard indexing is
    530 # used; if it fails with `AdvancedIndexingError`, advanced indexing is
    531 # used
    532 advanced = False

File ~/miniconda3/envs/prpro-2024/lib/python3.11/site-packages/pytensor/tensor/variable.py:521, in <listcomp>(.0)
    512     return (isinstance(val, (tuple, list)) and len(val) == 0) or (
    513         isinstance(val, np.ndarray) and val.size == 0
    514     )
    516 # Force input to be an int datatype if input is an empty list or tuple
    517 # Else leave it as is if it is a real number
    518 # Convert python literals to pytensor constants
    519 args = tuple(
    520     [
--> 521         pt.subtensor.as_index_constant(
    522             np.array(inp, dtype=np.uint8) if is_empty_array(inp) else inp
    523         )
    524         for inp in args
    525     ]
    526 )
    528 # Determine if advanced indexing is needed or not.  The logic is
    529 # already in `index_vars_to_types`: if it succeeds, standard indexing is
    530 # used; if it fails with `AdvancedIndexingError`, advanced indexing is
    531 # used
    532 advanced = False

File ~/miniconda3/envs/prpro-2024/lib/python3.11/site-packages/pytensor/tensor/subtensor.py:149, in as_index_constant(a)
    147     return ps.ScalarConstant(ps.int64, a)
    148 elif not isinstance(a, Variable):
--> 149     return as_tensor_variable(a)
    150 else:
    151     return a

File ~/miniconda3/envs/prpro-2024/lib/python3.11/site-packages/pytensor/tensor/__init__.py:50, in as_tensor_variable(x, name, ndim, **kwargs)
     18 def as_tensor_variable(
     19     x: TensorLike, name: Optional[str] = None, ndim: Optional[int] = None, **kwargs
     20 ) -> "TensorVariable":
     21     """Convert `x` into an equivalent `TensorVariable`.
     22 
     23     This function can be used to turn ndarrays, numbers, `ScalarType` instances,
   (...)
     48 
     49     """
---> 50     return _as_tensor_variable(x, name, ndim, **kwargs)

File ~/miniconda3/envs/prpro-2024/lib/python3.11/functools.py:909, in singledispatch.<locals>.wrapper(*args, **kw)
    905 if not args:
    906     raise TypeError(f'{funcname} requires at least '
    907                     '1 positional argument')
--> 909 return dispatch(args[0].__class__)(*args, **kw)

File ~/miniconda3/envs/prpro-2024/lib/python3.11/site-packages/pytensor/tensor/__init__.py:57, in _as_tensor_variable(x, name, ndim, **kwargs)
     53 @singledispatch
     54 def _as_tensor_variable(
     55     x: TensorLike, name: Optional[str], ndim: Optional[int], **kwargs
     56 ) -> "TensorVariable":
---> 57     raise NotImplementedError(f"Cannot convert {x!r} to a tensor variable.")

NotImplementedError: Cannot convert [1, 0, 0, 0, 0, 1]
Categories (2, int64): [0, 1] to a tensor variable.

The failure happens inside pytensor, which cannot convert the pandas Categorical I use as an index into a tensor variable. What should I do? I appreciate any help.
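
For what it's worth, the last frames of the traceback hint at the mechanism: as_tensor_variable dispatches on the type of its argument through functools.singledispatch, and the fallback raises NotImplementedError when no converter is registered for that type. Below is a minimal stand-alone sketch of that pattern. It is not pytensor's code: to_tensor_like and FakeCategorical are hypothetical names made up for illustration.

```python
from functools import singledispatch

import numpy as np


@singledispatch
def to_tensor_like(x):
    # Fallback: reached when no converter is registered for type(x).
    raise NotImplementedError(f"Cannot convert {x!r} to a tensor variable.")


@to_tensor_like.register(np.ndarray)
def _(x):
    # Hypothetical converter: ndarrays are accepted and passed through.
    return x


class FakeCategorical:
    """Hypothetical stand-in for a type without a registered converter."""

    def __init__(self, values):
        self.values = list(values)

    def __repr__(self):
        return f"FakeCategorical({self.values})"


# An ndarray hits the registered converter.
converted = to_tensor_like(np.array([1, 0, 0, 0, 0, 1]))

# The unregistered type falls through to the fallback and raises.
try:
    to_tensor_like(FakeCategorical([1, 0, 0, 0, 0, 1]))
    error_message = None
except NotImplementedError as err:
    error_message = str(err)
```

If the real dispatch behaves like this sketch, then any index that is already a NumPy integer array should get past this step, while a pd.Categorical should not.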

Thanks for letting me ask. While writing the post above, I realized that I was looking at an old commit of the Statistical Rethinking code (the head is here: pymc-resources/Rethinking_2/Chp_08.ipynb at main · pymc-devs/pymc-resources · GitHub, still model m_8_3). The newer version of the model does include a fix. For the record, here is my corrected MWE:

import numpy as np
import pandas as pd
import pymc as pm

cid = pd.Categorical(np.array([1, 0, 0, 0, 0, 1]))
vec = np.array([0.2, 0.3, 0.4, 0.5, 0.6, 0.7])
with pm.Model():
    # One intercept and one slope per category
    a = pm.Normal("a", 1, 0.1, shape=cid.categories.size)
    b = pm.Normal("b", 0, 0.3, shape=cid.categories.size)
    # Convert the Categorical to a plain NumPy array before indexing
    mu = a[np.array(cid)] + b[np.array(cid)] * vec

This works with pymc 5.10.2. I do not know why the explicit numpy conversion is now needed, but it does work.
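
My guess at the reason, going by the traceback in the question: pytensor does not seem to know how to convert a pd.Categorical, whereas a plain NumPy integer array is accepted. The sketch below looks at what the conversion produces, using only NumPy and pandas. The arrays a_demo and b_demo are hypothetical stand-ins for the random variables a and b, and the labels example is made up for illustration. I have not tried cid.codes inside a pymc model here, so treat that part as an assumption.

```python
import numpy as np
import pandas as pd

cid = pd.Categorical(np.array([1, 0, 0, 0, 0, 1]))
vec = np.array([0.2, 0.3, 0.4, 0.5, 0.6, 0.7])

# Two ways to get a plain integer index out of the Categorical.
idx_values = np.asarray(cid)  # the category values themselves
idx_codes = cid.codes         # positions of each entry in cid.categories

# Hypothetical stand-ins for the per-category parameters a and b.
a_demo = np.array([1.0, 1.1])
b_demo = np.array([0.0, 0.3])

# Same vectorized expression as in the model, on plain arrays.
mu_demo = a_demo[idx_codes] + b_demo[idx_codes] * vec

# When the categories are not the integers 0..n-1, the values cannot be
# used as an index, but the codes still can.
labels = pd.Categorical(["Africa", "other", "other", "Africa"])
label_codes = labels.codes
```

In this MWE the values and the codes coincide because the categories are exactly 0 and 1. With string labels, or integer labels that do not start at 0, cid.codes would be the index that matches shape = cid.categories.size.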