Cascade Hierarchical Model


I am hoping to get some help on how to program a particular model in PyMC. It is a two-stage hierarchical logistic regression cascade where:

Event A occurs with some probability, and if it does, Event B then occurs with some probability. If Event A does not occur, Event B cannot occur.

I have samples where B has occurred, where A has occurred but not B, and where A has not occurred. One thing to note is that this is not a balanced dataset.

Some ideas I had:

  1. The marginal probability of B is the product of A’s logistic probability and B’s conditional logistic probability (given A).
  2. The second stage should be trained only on events in which A was observed.
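For anyone reading later, the two ideas above can be combined into one log-likelihood. This is only a minimal sketch in plain Python, not PyMC code: `sigmoid`, `two_stage_loglik`, the `(a, b)` row layout and the scalar linear predictors `eta_a` and `eta_b` are all names I made up for illustration, and in a real regression the predictors would depend on covariates.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def two_stage_loglik(rows, eta_a, eta_b):
    """Log-likelihood of the cascade written as two Bernoulli stages.

    rows  : iterable of (a, b) pairs with a, b in {0, 1}; b is 0 whenever a is 0.
    eta_a : hypothetical linear predictor for A (intercept-only here).
    eta_b : hypothetical linear predictor for B given A (intercept-only here).
    """
    p_a = sigmoid(eta_a)
    p_b = sigmoid(eta_b)
    total = 0.0
    for a, b in rows:
        # Stage 1: every row carries information about p_A.
        total += math.log(p_a if a else 1.0 - p_a)
        # Stage 2: only rows where A occurred carry information about p_B.
        if a:
            total += math.log(p_b if b else 1.0 - p_b)
    return total


# One row with B, one with A but not B, two with no A.
rows = [(1, 1), (1, 0), (0, 0), (0, 0)]
print(two_stage_loglik(rows, eta_a=0.0, eta_b=0.0))
```

Rows where A did not occur contribute nothing to the second stage, which is what idea 2 amounts to, while a row with B contributes log(p_A) + log(p_B), which is idea 1.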

Any help would be greatly appreciated. Thank you.

I had an insight after creating this post. Instead of using two separate Bernoulli likelihoods, I use a single multinomial over the three outcomes (B occurred, A occurred but not B, A did not occur) with probabilities:

[p_A * p_B, p_A * (1-p_B), (1-p_A)]
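To make the probability vector concrete, here is a small plain-Python sketch (again not PyMC itself). The function names, the label coding (0 = B occurred, 1 = A occurred but not B, 2 = A did not occur) and the scalar predictors `eta_a` and `eta_b` are assumptions of mine for illustration.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def cascade_probs(eta_a, eta_b):
    """Return [P(A and B), P(A and not B), P(not A)] for the cascade."""
    p_a = sigmoid(eta_a)
    p_b = sigmoid(eta_b)
    return [p_a * p_b, p_a * (1.0 - p_b), 1.0 - p_a]


def categorical_loglik(labels, eta_a, eta_b):
    """Log-likelihood of outcome labels (0, 1 or 2) under the cascade."""
    probs = cascade_probs(eta_a, eta_b)
    return sum(math.log(probs[k]) for k in labels)


probs = cascade_probs(eta_a=0.0, eta_b=0.0)
print(probs)       # three outcome probabilities
print(sum(probs))  # they sum to 1 by construction
print(categorical_loglik([0, 1, 2, 2], eta_a=0.0, eta_b=0.0))
```

The three entries always sum to one because p_A * p_B + p_A * (1 - p_B) = p_A. In PyMC, a vector built this way could presumably be passed as the `p` argument of `pm.Categorical`, with `p_A` and `p_B` coming from the two logistic regressions.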

Because of the class imbalance I have to under-sample, but I correct the resulting probabilities with Bayes’ rule, following Dal Pozzolo et al. (2015). Hopefully this helps someone in the future.
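For reference, the binary form of that correction can be sketched as below. This assumes only the negative (majority) class was under-sampled and that each negative was kept with probability `beta`; the function name and arguments are mine, and the three-outcome case would need the same reweighting applied per class and then renormalised.

```python
def correct_undersampled(p_s, beta):
    """Map a probability estimated on under-sampled data back to the original scale.

    p_s  : positive-class probability predicted by the model fit on under-sampled data.
    beta : assumed probability with which each negative example was kept (0 < beta <= 1).
    """
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must be in (0, 1]")
    if not 0.0 <= p_s <= 1.0:
        raise ValueError("p_s must be a probability")
    # Bayes' rule: positives are always kept, negatives are kept with probability beta.
    return beta * p_s / (beta * p_s - p_s + 1.0)


# Keeping 10% of the negatives inflates the apparent positive rate;
# the correction shrinks it back.
print(correct_undersampled(0.5, beta=0.1))
```

With `beta = 1` (no under-sampling) the function returns `p_s` unchanged, which is a quick sanity check on the formula.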