Hidden Markov Model

I am trying to implement a particular type of hidden Markov model in PyMC3, but I keep running into runtime errors.
Here is a minimal example of the problem I am currently stuck on:

import numpy as np
import pymc3 as pm

nr_params = 2
min_nr_strategies = 1
max_nr_strategies = 10

with pm.Model() as jump_model:
    # number of strategies (clusters)
    # nr_strategies = pm.DiscreteUniform('nr_strategies', min_nr_strategies, max_nr_strategies)
    temp = pm.Categorical('temp', np.ones(max_nr_strategies) / max_nr_strategies)
    nr_strategies = temp + 1
    # the probability that the strategy changes from one time step to the next
    p_change = pm.Beta('p_change', 1, 1)
    # sample the strategy (cluster) of the first observation
    S = pm.Categorical('S0', ((np.arange(1, max_nr_strategies) <= nr_strategies) + 0) / nr_strategies)
    S_old = S
    for i in range(1, 10):
        # the random variables c1, c2, c3, ... are binary indicators of
        # whether or not the latent variable changed between time steps
        c = pm.Bernoulli(f'c{i}', p_change)
        # cluster of the next observation
        S_new = pm.Categorical(f'S_new{i}', c * ((np.arange(1, max_nr_strategies) <= nr_strategies) + 0) / nr_strategies)
        S = pm.Deterministic(f'S{i}', c * S_new + (1 - c) * S_old)
        S_old = S
    trace = pm.sample()

The runtime error says "ValueError: Bad initial energy: nan. The model might be misspecified.", and it arises because S1 is nan. But why is S1 nan? What am I doing wrong, and how should I do it instead?

I have also tried specifying the distribution of S_{t+1} as a function of the value of S_t, but when I do that, I get runtime errors because of the expression np.arange(0, max_nr_strategies-1) == S, which I use to create a delta distribution centered on the current value of S.
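Concretely, that attempt looks roughly like this inside the loop (a simplified sketch; how I mix it with the change indicator c is omitted):

# A point-mass (delta) probability vector on the current state S.
# The comparison below is where the runtime error comes from, presumably
# because S is a symbolic variable at this point rather than a NumPy integer.
delta = (np.arange(0, max_nr_strategies - 1) == S) + 0
S_new = pm.Categorical(f'S_new{i}', delta)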

This error usually indicates something wrong with your starting values or with your priors. Could you print out the result of [x.tag.test_value for x in jump_model.vars]? Also, have you tried sampling from your priors and seeing whether they look reasonable?
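One quick way to check where the log probability breaks down (assuming your PyMC3 version has it) is

jump_model.check_test_point()

which reports the logp of every free variable at the model's test point.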

EDIT: I'm also a bit nervous about the discrete variables in your model. Discrete variables are always tough to sample, and best practice is to marginalize them out if at all possible. I don't know that much about HMMs, though, so maybe I'm fretting unnecessarily…
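For what it's worth, here is roughly what I mean, sketched on a simplified version of your transition step (untested; I have fixed the number of states at max_nr_strategies instead of making it random, and I build the point-mass vector with theano.tensor ops, since a plain NumPy comparison does not work on a symbolic state):

import numpy as np
import pymc3 as pm
import theano.tensor as tt

max_nr_strategies = 10

with pm.Model() as sketch_model:
    p_change = pm.Beta('p_change', 1, 1)
    uniform = np.ones(max_nr_strategies) / max_nr_strategies
    S = pm.Categorical('S0', uniform)
    for i in range(1, 10):
        # one-hot vector on the current state, built with Theano ops
        delta = tt.eq(tt.arange(max_nr_strategies), S) * 1.0
        # with probability p_change the chain jumps to a uniformly chosen state,
        # otherwise it stays put; the Bernoulli change indicator is marginalized
        # out instead of being sampled explicitly
        S = pm.Categorical(f'S{i}', p_change * uniform + (1 - p_change) * delta)

The states S_i are still discrete, so they would still get a Metropolis step, but at least the change indicators are gone.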

Running [x.tag.test_value for x in jump_model.vars] yields

[0, 0.0, 0, 0, 0]

The code I posted samples from the model's priors; I am not even conditioning on any observations yet. The discrete variables are an important part of the model, because the goal is to infer which observation belongs to which latent category, so I don't think I can just marginalize them out. That said, if I were able to get just the marginal likelihood of this model, that would also be a step forward.

Yes, that was what I was trying to get at: if there was such a thing as a marginalized HMM, that would be helpful.

As to sampling from the prior, could you try using sample_prior_predictive? That might help. Currently, sample will try to use some compound step with NUTS and Metropolis, which is unnecessary for sampling from the prior: you just need some forward sampling here, since there is no data (as you say).
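For example, something like this (using the jump_model context from your snippet):

with jump_model:
    # pure forward sampling from the priors; no NUTS or Metropolis steps involved
    prior = pm.sample_prior_predictive(samples=500)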

When I use trace = pm.sample_prior_predictive(), the code runs without error.
But when I run

for RV in jump_model.basic_RVs:
    print(RV.name, RV.logp(jump_model.test_point))

then I see that the logp values for the S_new variables are always nan.
Do you have any idea why the S_new variables' logp values are all nan, and what I can do to address this problem?

Your code for inspecting the variable values

[x.tag.test_value for x in jump_model.vars]

now yields

[0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Furthermore, pm.traceplot(trace) throws a type error: "TypeError: unhashable type: 'slice'".