So at a high level, the graph is showing how data comes about. The term that gets thrown around is “ancestral sampling”, but I like to think about it as the “flow of information” in the model.
Nodes are values. If the value is a box, it's data (provided exogenously to the model). If it's a circle, it's a random variable.
The coloring is also related to data. If a shape is gray, it means that it is associated with some non-random data. If it’s white, it means there’s no associated data.
So in your graph specifically, `sigma` is just a value, i.e. constant data. It's gray because values from outside the model are informing it, and square because it's just holding that data: there's no random variable associated with it.
`obs`, on the other hand, is round and gray. This means that on one hand it's a random variable, but on the other hand it's conditioned on observations. People use the term "likelihood" for variables like `obs` (this comes from the well-known statement of Bayes' rule, posterior ∝ likelihood × prior), but I have beef with that. That's neither here nor there. Maybe the best way to think about it is latent vs. observed: when a variable is shaded, we observe and measure it; otherwise it's a latent (or hidden) variable.
`mu` is white and round, which means it's a random variable with no associated data.
The graph is read top to bottom. The arrows show dependencies in the model. `mu` is a root node: it doesn't depend on anything else, so you can start the generative story of this model by taking unconditional samples from `mu`. `obs` depends on both `mu` and `sigma`. The generative story of `obs`, then, is that you take a random draw from `mu`, take the fixed `sigma`, and use them to parameterize a normal distribution. 1000 independent draws are taken from this single (`mu`, `sigma`) pair. I know this because of the box around `obs`: it's called a plate, and it says there are iid copies of whatever is inside the plate.
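That generative story (ancestral sampling) can be sketched in plain Python. The N(0, 10) prior on `mu` and the value 1.5 for `sigma` are just placeholder assumptions here:

```python
import random
import statistics

rng = random.Random(42)

# Read the graph top to bottom:
sigma = 1.5                # gray box: fixed data, never sampled
mu = rng.gauss(0.0, 10.0)  # root node: one unconditional draw from mu's prior

# Plate around obs: 1000 iid draws, all sharing the single (mu, sigma) pair.
obs = [rng.gauss(mu, sigma) for _ in range(1000)]

# The sample mean of obs should land near the one mu we drew.
print(len(obs), statistics.mean(obs) - mu)
```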
`pm.Potential` terms appear on the graph as funny polygons. They don't really fit into the generative story, but they will appear if you have them.
The arrows point from `mu` to `obs` because that's the order in which samples are generated: you sample `mu`, then pass that sample into `obs`. The graph is telling you how to generate data, from start to finish. All of the terms you propose (B depends on A, B is conditioned on A, A parameterizes B) are valid imo.