# Hierarchical models with different numbers of observations

Hi PyMC developers,

I have a problem dealing with different numbers of observations per group.
I opened a topic in the PyTensor community, and @ricardoV94 recommended that I start this topic here.

I am using PyMC v5 for part of our biological study.
The purpose is to infer the true mutation probability from the rate of mutation of RNAs in massive sequencing data.

In our study, I would like to infer the distribution of the mutation probability, which differs between genes and cells.
Our data set has N cells and M genes (a matrix; N cells x M genes). In addition, each gene gives rise to multiple RNAs; however, the number of RNAs differs for each gene in each cell. The numbers of mutations (k) and bases (K) in each RNA have been counted beforehand (a ragged tensor; N cells x M genes x ?? RNAs).

e.g. 1) numbers of mutations (k) and bases (K) per RNA; 2 cells x 3 genes x ?? RNAs
[[[k111, k112, k113], [k121, k122], [k131]], [[k211, k212], [k221], [k231]]]
[[[K111, K112, K113], [K121, K122], [K131]], [[K211, K212], [K221], [K231]]]

In addition, the mutation probabilities (p), or rather their prior distributions, are related to one another but differ for each gene in each cell.

e.g. 2) mutation probability; 2 cells x 3 genes
[[p11, p12, p13], [p21, p22, p23]]

In our case, the number of mutations (k) in each RNA independently follows a binomial distribution (Binomial(k, K; θ), with θ being the mutation probability p of the corresponding gene and cell). Under these conditions, we would like to build two datasets, one for k and one for K (each N cells x M genes x ?? RNAs), and calculate the likelihood of each RNA under the binomial distribution.

In summary, I would like to apply an N x M matrix of probabilities to an N x M x ?? tensor of observations in PyMC to calculate the likelihoods. However, since the number of RNAs (observations) differs for each gene in each cell, I cannot build the tensors for k and K, and I get the error message below.

The requested array has an inhomogeneous shape after 2 dimensions. The detected shape was (2, 3) + inhomogeneous part.
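For context, a common way around this kind of error is to avoid the ragged tensor altogether and store the data in "long" format: one entry per RNA, plus integer index arrays recording which cell and gene each RNA belongs to. The sketch below uses made-up toy counts with the same 2 cells x 3 genes layout as the example above; the names `k_ragged`, `K_ragged`, `cell_idx` and `gene_idx` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy data: 2 cells x 3 genes, ragged in the last (RNA) dimension.
k_ragged = [[[1, 0, 2], [0, 1], [3]],
            [[0, 1], [2], [1]]]
K_ragged = [[[50, 40, 60], [30, 45], [80]],
            [[55, 35], [70], [65]]]

# Flatten to long format: one row per RNA, with its cell and gene index.
cell_idx, gene_idx, k_flat, K_flat = [], [], [], []
for i, (k_cell, K_cell) in enumerate(zip(k_ragged, K_ragged)):
    for j, (k_gene, K_gene) in enumerate(zip(k_cell, K_cell)):
        for k_rna, K_rna in zip(k_gene, K_gene):
            cell_idx.append(i)
            gene_idx.append(j)
            k_flat.append(k_rna)
            K_flat.append(K_rna)

cell_idx = np.array(cell_idx)
gene_idx = np.array(gene_idx)
k_flat = np.array(k_flat)
K_flat = np.array(K_flat)

# A dense (cells x genes) matrix can be expanded to one value per RNA
# with integer (fancy) indexing, so no ragged array is ever needed.
p = np.arange(1, 7).reshape(2, 3) / 100.0  # placeholder probabilities
p_per_rna = p[cell_idx, gene_idx]          # shape (number of RNAs,)
```

The four flat arrays all have length equal to the total number of RNAs, so they are ordinary rectangular NumPy arrays regardless of how many RNAs each gene has in each cell.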

If there is any way to define a matrix or tensor whose last dimension has a variable length, I would greatly appreciate your advice.
The version of PyMC I am using is v5.13.1.
Sorry for the long post, and thank you in advance.

Does the hierarchical model primer give the general picture? See "A Primer on Bayesian Methods for Multilevel Modeling" in the PyMC example gallery.