Efficiently generating samples consistent with observed variables to answer conditional queries

Hi @drbenvincent, thanks for your reply! I took some time to read more on causal inference, and it's clear to me now that I'm looking for the posterior distribution given observations, not counterfactuals.

Specifically, the problem I am studying is maximizing information gain under a probing budget. I want to exploit the information structure of the problem, encoded as a connected graph, and define an adaptive selection policy. Here's an illustration.

Consider a set of sensors S = {S_1, S_2, ..., S_n}, each providing some measurement of interest. The sensors form a DAG in which each sensor has some parents and can itself be a parent of other sensors. Each sensor depends on readings from its parents to work correctly, and each sensor also has an independent failure probability p_i. A sensor can therefore fail either due to its own independent failure or because any of its parents has failed. One by one, I can select a sensor to probe and check whether it is working correctly; we can assume that the probing action yields a precise measurement. My goal is to maximize my knowledge about the working status of all sensors in a minimal number of probes. I am modeling a sensor's working state as a noisy AND of its parents' states combined with its own independent failure probability.
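For concreteness, here is a minimal PyMC sketch of this failure model on a hypothetical 4-node DAG (the graph structure, the `p_fail` values, and the variable names are just placeholders):

```python
import pymc as pm

# Hypothetical DAG: A -> B, A -> C, (B, C) -> D.
# state == 1 means "working"; p_fail[i] is sensor i's own failure probability.
p_fail = {"A": 0.10, "B": 0.20, "C": 0.15, "D": 0.10}

with pm.Model() as model:
    # A root sensor works unless it fails on its own.
    A = pm.Bernoulli("A", p=1 - p_fail["A"])
    # Noisy AND: a child can work only if all of its parents work
    # and it does not fail independently.
    B = pm.Bernoulli("B", p=A * (1 - p_fail["B"]))
    C = pm.Bernoulli("C", p=A * (1 - p_fail["C"]))
    D = pm.Bernoulli("D", p=B * C * (1 - p_fail["D"]))
```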

Thus, I have Bernoulli sensor states (there could be more states in a more complicated version of the problem) and a well-defined description of all the relationships between children and their parents, i.e., P(X | Parents(X)). The adaptive selection policy would pick the node whose observation yields the highest expected reduction in entropy over the states of the whole graph. Intuitively, if I pick a node low in the DAG and observe that it has failed, I can tell that its descendants won't work, but I still have to explore its ancestors to find the source of the failure.
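As a sketch of how I'm approximating this objective with posterior samples: I sum the marginal Bernoulli entropies as a cheap proxy for (an upper bound on) the joint entropy, and use `pm.observe` (available in recent PyMC versions) to condition the model on a hypothetical probe outcome:

```python
import numpy as np
import pymc as pm

def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli(p), safe at p = 0 or 1."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def total_entropy(model, draws=2000):
    """Sum of marginal entropies over all unobserved sensor states.
    This is a proxy for (an upper bound on) the joint entropy."""
    with model:
        idata = pm.sample(draws=draws, progressbar=False)
    post = idata.posterior
    return sum(binary_entropy(post[v].mean().item()) for v in post.data_vars)

def expected_entropy_after_probe(model, name, p_working, draws=2000):
    """Expected remaining entropy after probing sensor `name`,
    weighting the two outcomes by its current marginal probability."""
    h_working = total_entropy(pm.observe(model, {name: 1}), draws)
    h_failed = total_entropy(pm.observe(model, {name: 0}), draws)
    return p_working * h_working + (1 - p_working) * h_failed
```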

I see that you mentioned that PyMC excels at continuous variables, but I ran some small-scale experiments on toy graphs (a DAG with ~6 nodes and Bernoulli states). I defined dependencies between the nodes, and could see that observing different nodes reduced the uncertainty over working states by different amounts, depending on the node's position in the graph. However, I have yet to see whether PyMC will scale to a realistic graph in my application.
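For reference, the greedy selection step in my toy experiments looks roughly like the following (reusing the hypothetical `model` and helpers from the sketches above):

```python
# Greedy one-step policy: probe the sensor with the largest
# expected reduction in total entropy.
with model:
    idata = pm.sample(2000, progressbar=False)
marginals = {v: idata.posterior[v].mean().item() for v in idata.posterior.data_vars}

h_now = sum(binary_entropy(p) for p in marginals.values())
gains = {
    name: h_now - expected_entropy_after_probe(model, name, p)
    for name, p in marginals.items()
}
best = max(gains, key=gains.get)  # sensor to probe next
```

This re-samples the model once per candidate outcome, which already feels expensive at 6 nodes, hence my scaling question. Any thoughts appreciated! :slight_smile: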