I have a small question about Bayesian networks and I hope someone here might be experienced enough to help me with some suggestions.
My question is about initializing a Bayesian model in which the nodes are a mix of discrete and continuous variables.
Starting with a simple discrete example, say we have a very simple model as:
SMOKING ------> CANCER
In the discrete case, I can specify CPTs (for example, estimated from data by counting) as:
smoking = { 'True' : 0.5, 'False' : 0.5 }
and then the child node can have its full conditional distribution specified as:
cancer = (
[[ 'True', 'True', 0.75 ],
[ 'True', 'False', 0.25 ],
[ 'False', 'True', 0.02 ],
[ 'False', 'False', 0.98 ]], [smoking] )
For this case, I can derive this distribution from my training data and generate the CPT.
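To make the counting step concrete, here is a minimal sketch of what I mean. The function name `estimate_cpt` and the toy observations are made up for illustration; the counts are chosen only to keep the arithmetic easy and are not meant to reproduce the numbers above.

```python
from collections import Counter

def estimate_cpt(samples):
    """Estimate P(child | parent) from (parent, child) pairs by relative frequency."""
    pair_counts = Counter(samples)
    parent_counts = Counter(parent for parent, _ in samples)
    # Each joint count is normalized by the count of its parent value.
    return {
        (parent, child): count / parent_counts[parent]
        for (parent, child), count in pair_counts.items()
    }

# Hypothetical (smoking, cancer) observations, invented for this example.
data = (
    [("True", "True")] * 3 + [("True", "False")] * 1
    + [("False", "True")] * 1 + [("False", "False")] * 4
)

cpt = estimate_cpt(data)
for (smoking_value, cancer_value), prob in sorted(cpt.items()):
    print(f"P(cancer={cancer_value} | smoking={smoking_value}) = {prob:.2f}")
```

Each row of the resulting table is keyed by (parent value, child value), which matches the row layout I used for the cancer node.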
Now I wonder how one should go about instantiating a model whose nodes are continuous. Imagine we have something like:
HEIGHT ------> WEIGHT
I have some recorded data, so I can take the mean and variance of the heights and model the parent node HEIGHT as a Gaussian.
Next I want to model the WEIGHT node. Imagine I have data with heights and corresponding weights. I guess these are samples from the joint distribution, and I want to represent P(Weight|Height) as a Gaussian. In the discrete case I could create the CPT directly from the data. How can one do this for continuous nodes?
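For what it is worth, the only form I am aware of for a continuous child with a continuous parent is the linear Gaussian, where Weight | Height is assumed to be N(a + b * Height, sigma^2) and a, b, sigma^2 are fitted by least squares. I am not sure this is the right or the only way, so please treat the sketch below as an assumption on my part. The function names and the five data points are invented for illustration.

```python
import math

def fit_linear_gaussian(heights, weights):
    """Fit weight | height ~ N(a + b * height, sigma^2) by least squares.

    Under the linear Gaussian assumption this is also the maximum likelihood
    fit; the variance uses a 1/n divisor (the MLE), not the unbiased 1/(n-2).
    """
    n = len(heights)
    mean_h = sum(heights) / n
    mean_w = sum(weights) / n
    cov_hw = sum((h - mean_h) * (w - mean_w) for h, w in zip(heights, weights)) / n
    var_h = sum((h - mean_h) ** 2 for h in heights) / n
    slope = cov_hw / var_h
    intercept = mean_w - slope * mean_h
    residual_var = sum(
        (w - (intercept + slope * h)) ** 2 for h, w in zip(heights, weights)
    ) / n
    return intercept, slope, residual_var

def conditional_density(weight, height, intercept, slope, residual_var):
    """Evaluate the fitted Gaussian density p(weight | height)."""
    mean = intercept + slope * height
    norm = math.sqrt(2 * math.pi * residual_var)
    return math.exp(-((weight - mean) ** 2) / (2 * residual_var)) / norm

# Hypothetical measurements (cm, kg), invented for this example.
heights = [150.0, 160.0, 170.0, 180.0, 190.0]
weights = [52.0, 58.0, 69.0, 77.0, 84.0]

intercept, slope, residual_var = fit_linear_gaussian(heights, weights)
print(f"mean weight given height h: {intercept:.2f} + {slope:.3f} * h")
print(f"conditional variance: {residual_var:.3f}")
```

If this is a reasonable approach, the three fitted numbers would play the role that the CPT plays in the discrete case.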
An additional question: the literature talks about using the EM algorithm to estimate these parameters, but I am confused about what else needs to be estimated once the conditional distributions have been specified like this.