Hierarchical Bernoulli model for machine fault rate

I see a few potential problems.

  1. It looks to me like you’re sharing a prior on {key}_logit:mu and {key}_logit:sd across all stages. Is this intentional? You may want to create these priors per stage instead.
  2. You’re treating this essentially as hierarchical logistic regression, right? You may need to enforce something like sum-to-zero constraints on the coefficients. This notebook has some examples that might help you get started.
  3. (edit) Also, on further thought, I’m not sure using \frac{e}{1 + e} is the right value for the prior - should probably just be 0, and may want to lose the hyperprior {key}_mu entirely.
  4. (edit) There’s also no intercept here - that might be a helpful thing to include.
with pm.Model() as model:
    deterministics = {}
    dists = {}
    thetas = []

    for key in data.keys():
        if key != target:
            # Edit: may want to replace this with mu = 0
            mu = pm.StudentT(f'{key}_mu', nu=1, mu=0, sd=1)
            sig = pm.HalfStudentT(f'{key}_sig', sd=1)
            dists[key] = pm.Normal(f'{key}_logit', mu=mu, sd=sig, shape=df[key].nunique())
            deterministics[key] = pm.Deterministic(key, pm.invlogit(dists[key]))

            thetas.append(dists[key][df[key].values])

    intercept = pm.Normal('intercept', mu=0, sd=1)

    theta = intercept + T.sum(thetas, axis=0)
    
    # likelihood of observed data
    fault = pm.Bernoulli(target, p=pm.invlogit(theta), observed=df[target].values)