Learning Poisson State Space Model parameters for a large number of samples

Thanks a lot for your suggestion. I spent some time working on the code and trying to make it work with ADVI. The code below runs fairly fast and gives reasonable accuracy.

import numpy as np
import pymc as pm
import pytensor
import pytensor.tensor as pt
from pymc.pytensorf import collect_default_updates

with pm.Model() as mod:

    # Prior on the latent-to-observation weights Z
    sd_dist1 = pm.Exponential.dist(1.0, shape=latent_variables)
    mu1 = np.zeros(latent_variables)
    chol1, corr1, stds1 = pm.LKJCholeskyCov('chol_cov1', n=latent_variables, eta=2,
                                            sd_dist=sd_dist1, compute_corr=True)
    pred_Z = pm.MvNormal('pred_Z', mu=mu1, chol=chol1, shape=latent_variables)

    # Prior on the exogenous-variable coefficients B
    # (sd_dist must match n, so each LKJCholeskyCov gets its own sd_dist)
    sd_dist2 = pm.Exponential.dist(1.0, shape=num_features)
    mu2 = np.zeros(num_features)
    chol2, corr2, stds2 = pm.LKJCholeskyCov('chol_cov2', n=num_features, eta=2,
                                            sd_dist=sd_dist2, compute_corr=True)
    pred_B = pm.MvNormal('pred_B', mu=mu2, chol=chol2, shape=num_features)
    
    A_pt = pt.as_tensor_variable(A_block)  # fixed state-transition matrix built from A_block

    # Shared diagonal state-noise precision: all samples use the same Q
    sigmas_Q = pm.HalfNormal('sigmas_Q', sigma=1, shape=latent_variables)
    Q = pt.diag(sigmas_Q)
    u = pm.MutableData("input", u_scale)
    #x0 = pm.MutableData("x_init", x_0)
    data = pm.MutableData("data", data_train)
    samples = u.shape[0]
    x0 = pt.zeros((samples, latent_variables))
    def step(x, A, Q, samples):
        # One transition: x_{t+1} = x_t A + innovation, drawn for all samples at once
        innov = pm.MvNormal.dist(mu=pt.zeros(latent_variables), tau=Q, size=samples)
        next_x = pt.nlinalg.matrix_dot(x, A) + innov
        return next_x, collect_default_updates([x, A, Q, samples], [next_x])
    
    hidden_states_pt, updates = pytensor.scan(step, 
                                              outputs_info=[x0], 
                                              non_sequences=[A_pt, Q, samples],
                                              n_steps=T, 
                                              strict=True)
    
    # Register the scan output as a random variable of the model
    mod.register_rv(hidden_states_pt, name='hidden_states', initval=pt.zeros((T, samples, latent_variables)))
    
    
    #pred_Z = pm.HalfNormal('pred_Z', sigma=1, size=latent_variables)
    # Exogenous contribution B'u_i for every sample
    temp1 = pt.transpose(pt.nlinalg.matrix_dot(u, pred_B))

    #hidden_states_pt = pt.reshape(hidden_states_pt, (T, samples, latent_variables))
    lambdas = pt.exp(pt.squeeze(pt.nlinalg.matrix_dot(hidden_states_pt, pred_Z)) + temp1)

    obs = pm.Poisson('obs', lambdas, observed=data)

I reduced the size of Q, since all the samples share the same Q, and made some changes to make the code more interpretable. Let me know what you think of the code.
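For reference, this is roughly the ADVI call I used to fit the model (a minimal sketch; the number of iterations and posterior draws are arbitrary choices):

with mod:
    approx = pm.fit(n=50_000, method="advi")  # mean-field ADVI
    idata = approx.sample(1_000)              # draws from the fitted approximation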

I have implemented the model from the paper, but it does not include a state space model with exogenous variables, which is why I am working on the structure below:

$$
\begin{aligned}
X_{i,t+1} &\sim N(\mu_{i,t}, Q)\\
\mu_{i,t} &= A X_{i,t} + B U_{i}\\
y_{i,t} &\sim \mathrm{Poisson}(\lambda_{i,t})\\
\lambda_{i,t} &= C \exp(X_{i,t})
\end{aligned}
$$

where $i$ indexes the samples and $U_i$ holds the exogenous variables. This is the most basic version that I need to get working; after that I will keep extending the model, since some exogenous variables will be time-dependent and others time-independent. The point of sharing the paper was to show where the above state space model idea comes from.
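To make the generative process concrete, here is a minimal NumPy sketch that simulates from the structure above (all dimensions and parameter values are arbitrary assumptions, and C is taken as a scalar for simplicity):

import numpy as np

rng = np.random.default_rng(0)
T, n_samples, n_latent, n_features = 50, 10, 4, 3

A = 0.9 * np.eye(n_latent)                    # state-transition matrix (assumed stable)
B = rng.normal(size=(n_features, n_latent))   # exogenous coefficients
Q = 0.1 * np.eye(n_latent)                    # state-noise covariance
C = 1.0                                       # observation scaling (scalar here)
U = rng.normal(size=(n_samples, n_features))  # time-independent exogenous variables

X = np.zeros((T, n_samples, n_latent))
for t in range(T - 1):
    mu = X[t] @ A + U @ B                     # mu_{i,t} = A x_{i,t} + B U_i
    X[t + 1] = mu + rng.multivariate_normal(np.zeros(n_latent), Q, size=n_samples)

lam = C * np.exp(X)                           # lambda_{i,t} = C exp(X_{i,t})
y = rng.poisson(lam)                          # y_{i,t} ~ Poisson(lambda_{i,t})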