Missing data: np.nan in a multidimensional Poisson distribution

In the following model, the observed variable AD is 2-dimensional and Poisson distributed. Each row of AD corresponds to 1 “position” and is associated with 1 AF. The number of positions (nPos) is fixed. Each position is associated with multiple “samples” which correspond to the columns (nSample) and the number of samples observed varies among positions. The ‘missing’ samples are np.nan in the input array. So in a way, the samples are pooled at the ‘position’ level and the positions are pooled at the AF level so that AF for positions are sampled from a single Gamma distribution. TD is an observed covariate that has the same dimensioan as AD and np.nan in exactly the same coordinates in the matrix.
Now, with missing data filled as np.nan, inference gave an error ‘SamplingError: Initial evaluation of model at starting point failed!’ with -inf likelihood for AD, which I assume to be caused by the np.nan. My question is, without imputation, how do I rewrite the model to make it work?

with pm.Model() as m:
        TD_observed = pm.MutableData('TD_observed', np.array(dataDict['TD_observed']))  # (nPos, nSample)
        AD_observed = pm.MutableData('AD_observed', np.array(dataDict['AD_observed']))  # (nPos, nSample)

        gammaMu = pm.LogNormal("gammaMu", mu=1, sigma=1)  # (1,)
        gammaStd = pm.LogNormal("gammaStd", mu=1, sigma=1)  # (1,)
        gammaShape = pm.Deterministic('gammaShape', gammaMu ** 2 / gammaStd ** 2)  # (1,)
        gammaBeta = pm.Deterministic('gammaBeta', (gammaShape / gammaMu))  # (1,)
        AF = pm.Gamma('AF', alpha=gammaShape, beta=gammaBeta, shape=(dataDict['nPos'], 1))  # (nPos, 1)

        AF_coeff = .0001
        lambda_p = pm.Deterministic('lambda_p', TD_observed * AF * AF_coeff)  # (nPos, nSample) * (nPos, 1) > (nPos, nSample)
        AD = pm.Poisson('AD', mu=lambda_p, observed=AD_observed)  # (nPos, nSample)

Any help is appreciated.

The problem is that you are using MutableData for the observed. Imputation only works with constant data, because we need to know which RVs to create when building the model.

Automatic imputation is a shortcut for creating unobserved variables with the same distribution as the observed ones (plus a Deterministic that joins the two together for prior/posterior predictive):

The following are equivalent (but not exactly what we do with automatic imputation):

pm.Normal("x", 0, 1, observed=[0, 1, 2, np.nan, np.nan])
x_obs = pm.Normal("x_obs", 0, 1, observed=[0, 1, 2])
x_missing = pm.Normal("x_missing", 0, 1, shape=(2,))
x = pm.Deterministic("x", at.concatenate([x_obs, x_missing])

We might try to implement Mutable imputation in the future, but that is not in the roadmap at the moment.

Thank you for the answer!
Not sure if I understood it correcly, but after changing the model to the following

    with pm.Model() as m:

        gammaMu = pm.LogNormal("gammaMu", mu=1, sigma=1)  # (1,)
        gammaStd = pm.LogNormal("gammaStd", mu=1, sigma=1)  # (1,)
        gammaShape = pm.Deterministic('gammaShape', gammaMu ** 2 / gammaStd ** 2)  # (1,)
        gammaBeta = pm.Deterministic('gammaBeta', (gammaShape / gammaMu))  # (1,)
        AF = pm.Gamma('AF', alpha=gammaShape, beta=gammaBeta, shape=(dataDict['nPos'], 1))  # (nPos, 1)

        AF_coeff = .0001
        lambda_p = pm.Deterministic('lambda_p', dataDict['TD_observed'] * AF * AF_coeff)  # (nPos, nSample) * (nPos, 1) > (nPos, nSample)
        AD = pm.Poisson('AD', mu=lambda_p, observed=dataDict['AD_observed'])  # (nPos, nSample)

imputation seemed to work (an imputation warning raised). However, a similar error was raised still:

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'gammaMu_log__': array(1.5), 'gammaStd_log__': array(1.5), 'AF_log__': array([[1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5],
       [1.5]]), 'AD_missing': array([-9223372036854775808, -9223372036854775808, -9223372036854775808,
       -9223372036854775808])}

Initial evaluation results:
{'gammaMu': -1.04, 'gammaStd': -1.04, 'AF': -20.0, 'AD_missing': -inf, 'AD_observed': -969.88}

Regarding this special case where a covariate TD also has missing data, since missing ADs are to be imputed from the Poisson, which relies partially on TD, is this causing the imputation failure? Or the more direct question, how is the TD imputated internally?

You are getting some insane values for AD_missing, suggesting your prior on lambda_p may be too large (and overflow?)

Some updates and a follow-up question. I have changed the model to use a mask and it worked. Specifically, I filled missing observed AD and covariates TD with redundant ones and applied a 0/1 array mask at the last step. My understanding is that, since the covariates TD have no prior distribution, there is no way to imputate it and using mask is the only way to this type of problem. In addition, now I need to predict on hold-out data, meaning I have to change the input to MutableData. Once again, it seems using mask is the only way out. Is that right? @ricardoV94

You can also try to imputate covariates, but it’s a model decision. There’s a good talk on this: Partial Missing Multivariate Observation and What to Do With Them by Junpeng Lao

Again, thank you very much.