Naive Bayes multivariate HMM

Hello,

I’m new to PyMC3 and want to adopt the framework for one of the problems I’m working on. I need to construct a discrete HMM whose emission distribution is the product of independent per-feature distributions. The issue I’m facing right now is a dimension mismatch between the individual distributions, because the features do not all have the same number of labels. Here is what I have tried so far:
import numpy as np
import pymc3 as pm
from numpy.random import seed, rand

seed(1)

features = np.array([[0,1,0],[2,0,1],[1,0,2],[1,2,1],[2,1,3],[2,0,1],[2,1,1]])
N_states = 3
# Pe = 
feature_dist = []
                    
with pm.Model() as model:
    for i, feature in enumerate(features.T):
        N_labels = len(np.unique(features[:,i]))
        feature_dist.append(pm.Dirichlet('P_emission_'+str(i),
                           a=np.ones((N_states, N_labels)),
                           shape=(N_states, N_labels),
                           testval=rand(N_states, N_labels)))
        
# Attempt to combine the per-feature emission tables by elementwise multiplication
feature_pm = feature_dist[0]
for dist in feature_dist[1:]:
    feature_pm = feature_pm * dist

This raises the following error:
ValueError: Input dimension mis-match. (input[0].shape[1] = 3, input[1].shape[1] = 4)
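
The mismatch can be reproduced without PyMC3. The following is a minimal NumPy sketch, assuming uniform arrays as stand-ins for the Dirichlet variables (the names `n_labels`, `tables`, and `mismatch` are illustrative only). It shows that the third feature has four labels while the first two have three, so the per-feature tables cannot be multiplied elementwise:

```python
import numpy as np

features = np.array([[0, 1, 0], [2, 0, 1], [1, 0, 2], [1, 2, 1],
                     [2, 1, 3], [2, 0, 1], [2, 1, 1]])
N_states = 3

# Number of distinct labels observed in each feature column.
n_labels = [len(np.unique(col)) for col in features.T]
print(n_labels)  # [3, 3, 4]

# Stand-ins for the Dirichlet variables (an assumption for illustration):
# one (N_states, N_labels) table per feature, with uniform rows.
tables = [np.full((N_states, k), 1.0 / k) for k in n_labels]

mismatch = False
try:
    # (3, 3) * (3, 3) works, but the result cannot be multiplied by (3, 4).
    tables[0] * tables[1] * tables[2]
except ValueError as err:
    mismatch = True
    print("shape mismatch:", err)
```

The tables live on different label spaces, so they are not meant to be multiplied with each other directly; the product of independent features applies to the probabilities of the observed labels, not to the full tables.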

I have worked with the pomegranate library before, and its IndependentComponentsDistribution addresses this problem really well. I want to move to PyMC3 to make use of the inference algorithms it offers.

I have also tried the pm.Mixture distribution, but it does not look applicable here. What is the best way to set this up? I have read "How to marginalized Hidden Markov Model with categorical?" as a reference, but I still can’t figure it out. Any pointers are welcome. Many thanks!

Hi,
I think @brandonwillard will be able to give you some pointers here.

Our package pymc3-hmm has some custom PyMC3 Distributions and sampling methods for HMMs that do all the heavy shape lifting, so give that a try.
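
For intuition about the shapes involved, here is a rough NumPy sketch of the underlying computation. It is not the pymc3-hmm API, and the function names `emission_logp` and `forward_loglik` are made up for illustration. It assumes each feature's labels are coded as 0..K-1 and uses uniform placeholder parameters. Each feature keeps its own (N_states, N_labels) table, the observed labels index into that table, and the per-feature log-probabilities are summed, so the tables never need matching widths. The forward algorithm then marginalizes over the hidden states:

```python
import numpy as np

def emission_logp(features, tables):
    """Return log p(x_t | z) for every time step t and state z, shape (T, N_states).

    tables[i] has shape (N_states, N_labels_i) with rows summing to 1.
    Assumes labels of feature i are integers in 0..N_labels_i - 1.
    """
    n_steps = features.shape[0]
    n_states = tables[0].shape[0]
    logp = np.zeros((n_steps, n_states))
    for i, table in enumerate(tables):
        # table[:, features[:, i]] has shape (N_states, T): probability of the
        # observed label of feature i under each state, at each time step.
        logp += np.log(table[:, features[:, i]]).T
    return logp

def forward_loglik(logp_emit, log_pi, log_A):
    """Marginal log-likelihood of the sequence via the forward algorithm."""
    alpha = log_pi + logp_emit[0]
    for t in range(1, logp_emit.shape[0]):
        # scores[i, j] = alpha[i] + log_A[i, j]; log-sum-exp over i.
        scores = alpha[:, None] + log_A
        m = scores.max(axis=0)
        alpha = m + np.log(np.exp(scores - m).sum(axis=0)) + logp_emit[t]
    m = alpha.max()
    return m + np.log(np.exp(alpha - m).sum())

features = np.array([[0, 1, 0], [2, 0, 1], [1, 0, 2], [1, 2, 1],
                     [2, 1, 3], [2, 0, 1], [2, 1, 1]])
n_states = 3
n_labels = [len(np.unique(col)) for col in features.T]

# Placeholder parameters (assumption): uniform emissions, initial
# distribution, and transition matrix.
tables = [np.full((n_states, k), 1.0 / k) for k in n_labels]
log_pi = np.log(np.full(n_states, 1.0 / n_states))
log_A = np.log(np.full((n_states, n_states), 1.0 / n_states))

logp_emit = emission_logp(features, tables)
loglik = forward_loglik(logp_emit, log_pi, log_A)
print(logp_emit.shape, loglik)
```

The same index-then-sum idea should carry over to Theano tensors inside a PyMC3 model, for example as a log-likelihood term added through pm.Potential, but that variant is not shown or tested here.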
