Marginalizing over missing categories

Update:
I think I found the issue with the above model. It appears I was a bit confused about the role of the 'shape' parameter when specifying the observed RVs. This blog post from @lucianopaz really helped clarify things!
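The part that tripped me up is how a vector of per-source parameters gets expanded to one value per observation. That indexing step can be sketched in plain NumPy (the values below are made up just for illustration):

```python
import numpy as np

# Hypothetical values: 3 sources, 5 observations
n_sources = 3
a = np.array([-1.0, 0.0, 1.0])       # one intercept per source, shape (n_sources,)
source = np.array([0, 2, 1, 0, 2])   # per-observation source code

# Fancy indexing a[source] expands the (n_sources,) vector to one entry per
# observation, which is exactly what a[source] does inside the model
p = 1 / (1 + np.exp(-a[source]))     # inverse logit, shape (5,)
print(p.shape)                        # (5,)
```

So the 'shape' on the prior is the number of sources, while the observed RV's length comes from the indexed vector, not from a shape argument.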

This is my updated model, which seems to sample better (~1 hr). Moreover, I can now fit it with 'advi'. There are still 1-2 divergent transitions and I haven't given it a full run-through with posterior predictive checks, so there may still be a better way to specify things.

import numpy as np
import pymc3 as pm

with pm.Model() as model2:
    mu_alpha = pm.StudentT("mu_alpha", nu=1, mu=0, sigma=100)
    sigma_alpha = pm.HalfNormal("sigma_alpha", sigma=100)
    source = pm.intX(pm.Data("source", sources_encoded[observed_idx.flatten()]))

    # one intercept per source, partially pooled
    a = pm.Normal("a", mu=mu_alpha, sigma=sigma_alpha, shape=(n_sources,))
    p = pm.Deterministic("p", pm.math.invlogit(a[source]))
    becameLicensedObs = pm.Binomial(
        "outcomeObs", n=1, p=p, observed=df["outcome"].values[observed_idx.flatten()],
    )
    # latent probability that a missing observation belongs to each category
    pMiss = pm.Dirichlet("pMiss", a=np.ones(n_sources))
    # components of the mixture generating the observations with
    # unknown categories. This assumes that the missing
    # observations come from one of the categories known to be part of source
    # and that missingness might depend on what that category is.
    comps = pm.Binomial.dist(n=1, p=pm.math.invlogit(a), shape=(n_sources,))
    becameLicensedMissing = pm.Mixture(
        "outcomeMissing",
        comp_dists=comps,
        w=pMiss,
        observed=df["outcome"].values[missing_idx.flatten()],
    )

As a side note: generating the plate-notation diagram with Graphviz was really helpful in working through this problem, so I thought I would include it here:
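In case it's useful, this is roughly how I generated it, assuming PyMC3's built-in helper (which requires the graphviz package to be installed):

```python
import pymc3 as pm

# Build the plate-notation graph for the model defined above
graph = pm.model_to_graphviz(model2)
graph.render("model2_plates", format="png")  # writes model2_plates.png
```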
