UPDATE
I have thought about it a bit more and tried a different approach.
I figured to make things easier I will first try to create the model for a fixed number of three steps (i.e. three action choices made by the agent). Because every C(s) depends on the next choice except for the last one, I could start with the choice distribution for the final state and then iteratively define the ancestors. To do so, I used a CustomDist for the choices:
# These 3 methods are used for the custom distribution of the choice function.
# U is the utility function which is uniform between 0 and 1 for every state.
# next_choice follows the choice distribution, hence every state's choice distribution
# depends on the next state's choice, except for the terminal state.
def expected_utility(action, state, U, next_choice):
if is_terminal_state(state):
return 0
u = U[state % 10]
next_state = transition(state, next_choice)
next_u = U[next_state % 10]
return u + next_u
def choice_logp(action, state, U, next_choice, alpha):
eu = expected_utility(action, state, U, next_choice)
return alpha * eu
def choice_random(state, U, next_choice, alpha, rng=None, size=None):
eus = np.array([expected_utility(action, state, U, next_choice) for action in ACTIONS])
probs = (alpha * np.exp(eus)) / np.sum(eus)
action = np.array([rng.random.choice(ACTIONS, p=probs) for _ in size])
return action
# The following lines are used to create the agent model:
# In this model the agent visited a total of three states, the last of which was terminal,
# and took three actions (made three choices). Hence, there are three choice variables,
# corresponding to the first, second, and last choice. The first choice depends on the
# second and the second depends on the last.
states = [86,97,100]
actions = [1,0,0]
agent_model_fixed_length = pm.Model()
with agent_model_fixed_length:
alpha = pm.Uniform(name="alpha", lower=0., upper=1.)
U = pm.Uniform("U", lower=[0.]*POSITIONS, upper=[1.]*POSITIONS)
last_c = pm.CustomDist(
'last_c',
states[-1],
U,
pt.dscalar(),
alpha,
logp=choice_logp,
random=choice_random,
observed=[actions[-1]],
)
middle_c = pm.CustomDist(
'middle_c',
states[-2],
U,
last_c,
alpha,
logp=choice_logp,
random=choice_random,
observed=[actions[-2]],
)
first_c = pm.CustomDist(
'first_c',
states[0],
U,
middle_c,
alpha,
logp=choice_logp,
random=choice_random,
observed=[actions[0]],
)
When I print the graph of this model it looks as expected and I can also sample from it just fine.
Of course, ultimately I want to model agents that made a variable number of choices. For this, I took inspiration from this question on Markov Chains and tried to make my code more general using PyTensor’s scan function:
states = [86,97,100]
actions = [1,0,0]
agent_model = pm.Model()
with agent_model:
actions = pt.as_tensor_variable([actions], dtype="int32")
states = pt.as_tensor_variable(states, dtype="int32")
alpha = pm.Uniform(name="alpha", lower=0., upper=1.)
U = pm.Uniform("U", lower=[0.]*POSITIONS, upper=[1.]*POSITIONS)
# random and logp are defined as above.
def choice(next_choice, state, U, alpha):
return pm.CustomDist.dist(
state,
U,
next_choice,
alpha,
logp=choice_logp,
random=choice_random,
class_name="choice",
dtype="int32"
)
output, updates = pytensor.scan(fn=choice,
outputs_info=[pt.ones_like(states)],
sequences=[states],
non_sequences=[U, alpha],
n_steps=3)
agent_model.register_rv(output, name="choice", observed=actions)
The graph of the model now looks like this:

However, when I run pm.sample(), I get the following error:
/usr/local/lib/python3.10/dist-packages/pytensor/tensor/basic.py in _as_tensor_Variable(x, name, ndim, **kwargs)
99 def _as_tensor_Variable(x, name, ndim, **kwargs):
100 if not isinstance(x.type, TensorType):
--> 101 raise TypeError(
102 f"Tensor type field must be a TensorType; found {type(x.type)}."
103 )
TypeError: Tensor type field must be a TensorType; found <class 'pytensor.tensor.random.type.RandomGeneratorType'>.
Further, from printing the inputs to choice_logp() and expected_utility(), it also looks like they were called with <TensorType(int32, (10,))> as the value of action. However, 10 is not a valid value for action, as it has to be either 0 or 1.
Moreover, pm.sample_prior_predictive(100) gives the following error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Apply node that caused the error: CustomDist_choice_rv{0, (0, 0, 0, 0), int32, False}(*2-<RandomGeneratorType>, TensorConstant{[]}, TensorConstant{3}, *1-<TensorType(int32, (3,))>, *3-<TensorType(float64, (10,))>, *0-<TensorType(int32, ())>, *4-<TensorType(float64, ())>)
Toposort index: 0
Inputs types: [RandomGeneratorType, TensorType(int64, (0,)), TensorType(int64, ()), TensorType(int32, (3,)), TensorType(float64, (10,)), TensorType(int32, ()), TensorType(float64, ())]
Inputs shapes: ['No shapes', (0,), (), (3,), (10,), (), ()]
Inputs strides: ['No strides', (8,), (), (4,), (8,), (), ()]
Inputs values: [Generator(PCG64) at 0x7F684AF83AE0, array([], dtype=int64), array(3), array([1, 1, 1], dtype=int32), 'not shown', array(86, dtype=int32), array(0.55775481)]
Outputs clients: [[], ['output']]
Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
File "/usr/local/lib/python3.10/dist-packages/IPython/core/interactiveshell.py", line 3553, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-63-1e78187dfaf3>", line 26, in <cell line: 6>
output, updates = pytensor.scan(fn=choice,
File "/usr/local/lib/python3.10/dist-packages/pytensor/scan/basic.py", line 852, in scan
raw_inner_outputs = fn(*args)
File "<ipython-input-63-1e78187dfaf3>", line 14, in choice
d = pm.CustomDist.dist(
File "/usr/local/lib/python3.10/dist-packages/pymc/distributions/distribution.py", line 1034, in dist
return _CustomDist.dist(
File "/usr/local/lib/python3.10/dist-packages/pymc/distributions/distribution.py", line 524, in dist
return super().dist(
File "/usr/local/lib/python3.10/dist-packages/pymc/distributions/distribution.py", line 389, in dist
rv_out = cls.rv_op(*dist_params, size=create_size, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/pymc/distributions/distribution.py", line 579, in rv_op
return rv_op(*dist_params, **kwargs)
HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-62-9176fb444312> in expected_utility(action, state, U, next_choice)
1 def expected_utility(action, state, U, next_choice):
2 print(f"calculating eu of action {action} in state {state}")
----> 3 if is_terminal_state(state):
4 return 0
5 u = U[state % 10]
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Apply node that caused the error: CustomDist_choice_rv{0, (0, 0, 0, 0), int32, False}(*2-<RandomGeneratorType>, TensorConstant{[]}, TensorConstant{3}, *1-<TensorType(int32, (3,))>, *3-<TensorType(float64, (10,))>, *0-<TensorType(int32, ())>, *4-<TensorType(float64, ())>)
Toposort index: 0
Inputs types: [RandomGeneratorType, TensorType(int64, (0,)), TensorType(int64, ()), TensorType(int32, (3,)), TensorType(float64, (10,)), TensorType(int32, ()), TensorType(float64, ())]
Inputs shapes: ['No shapes', (0,), (), (3,), (10,), (), ()]
Inputs strides: ['No strides', (8,), (), (4,), (8,), (), ()]
Inputs values: [Generator(PCG64) at 0x7F684AF83AE0, array([], dtype=int64), array(3), array([1, 1, 1], dtype=int32), 'not shown', array(86, dtype=int32), array(0.55775481)]
Outputs clients: [[], ['output']]
Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
File "/usr/local/lib/python3.10/dist-packages/IPython/core/interactiveshell.py", line 3553, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-63-1e78187dfaf3>", line 26, in <cell line: 6>
output, updates = pytensor.scan(fn=choice,
File "/usr/local/lib/python3.10/dist-packages/pytensor/scan/basic.py", line 852, in scan
raw_inner_outputs = fn(*args)
File "<ipython-input-63-1e78187dfaf3>", line 14, in choice
d = pm.CustomDist.dist(
File "/usr/local/lib/python3.10/dist-packages/pymc/distributions/distribution.py", line 1034, in dist
return _CustomDist.dist(
File "/usr/local/lib/python3.10/dist-packages/pymc/distributions/distribution.py", line 524, in dist
return super().dist(
File "/usr/local/lib/python3.10/dist-packages/pymc/distributions/distribution.py", line 389, in dist
rv_out = cls.rv_op(*dist_params, size=create_size, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/pymc/distributions/distribution.py", line 579, in rv_op
return rv_op(*dist_params, **kwargs)
HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
Apply node that caused the error: forall_inplace,cpu,scan_fn}(TensorConstant{3}, TensorConstant{[ 86 97 100]}, IncSubtensor{InplaceSet;:int64:}.0, RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7F684AF83AE0>), U, alpha)
Toposort index: 4
Inputs types: [TensorType(int8, ()), TensorType(int32, (3,)), TensorType(int32, (3, 3)), RandomGeneratorType, TensorType(float64, (10,)), TensorType(float64, ())]
Inputs shapes: [(), (3,), (3, 3), 'No shapes', (10,), ()]
Inputs strides: [(), (4,), (12, 4), 'No strides', (8,), ()]
Inputs values: [array(3, dtype=int8), array([ 86, 97, 100], dtype=int32), 'not shown', Generator(PCG64) at 0x7F684AF83AE0, 'not shown', array(0.55775481)]
Outputs clients: [['output']]
HINT: Re-running with most PyTensor optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the PyTensor flag 'optimizer=fast_compile'. If that does not work, PyTensor optimizations can be disabled with 'optimizer=None'.
HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
It seems that somehow choice_random() is called with [1, 1, 1] as the value for state and that this causes the “truth value of an array with more than one element is ambiguous” error when state is compared to the terminal state.
In both cases, I do not know what could cause this behavior. Presumably, I am doing something wrong with how I use scan, since the case with a fixed number of choices worked. Any ideas on what could cause these errors would be appreciated. Thanks!