That is strange. Which versions are you using?
As a workaround, if you only use `stateActions` for indexing, you can avoid making them arguments of the lambda, as you already do with `params`. Then pass only `rewards` as observed (only one observed argument is necessary), so it no longer needs to be a dict. Again, there are many ways to do exactly the same thing with a `DensityDist`, as shown in Using a random variable as observed - #5 by OriolAbril.
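To illustrate the idea without depending on a particular PyMC version, here is a minimal plain-Python sketch of the closure pattern. Everything in it is an assumption made for illustration: the `make_logp` helper, the `q_values` table standing in for your `params`, the `(state, action)` layout of `state_actions`, and the Gaussian likelihood are all hypothetical and not taken from your model.

```python
import math

def make_logp(q_values, state_actions, sigma=1.0):
    """Build a log-likelihood whose only argument is `rewards`.

    `q_values` (hypothetical stand-in for params) and `state_actions`
    are captured by the closure, so they are never passed as observed data.
    """
    def logp(rewards):
        total = 0.0
        for (state, action), reward in zip(state_actions, rewards):
            # state_actions is used purely for indexing into q_values
            mu = q_values[state][action]
            # Gaussian log-density (an assumed likelihood, for illustration only)
            total += (-0.5 * ((reward - mu) / sigma) ** 2
                      - math.log(sigma * math.sqrt(2 * math.pi)))
        return total
    return logp

# Toy data: two states, two actions, three trials
q_values = [[0.0, 1.0], [0.5, 0.2]]
state_actions = [(0, 1), (1, 0), (0, 0)]
rewards = [1.0, 0.5, 0.0]

logp = make_logp(q_values, state_actions)
log_likelihood = logp(rewards)
```

In a real model the body of `logp` would use tensor operations instead of a Python loop, and the resulting function would be what you hand to `DensityDist` with `observed=rewards`, a single array rather than a dict.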