Estimating conditional probabilities

It’s a beginner’s question…
Let’s assume we’ve got a graph as in the picture below:

Now you have a lot of data that looks like this (all nodes are either True (T) or False (F)):

   D | I | G | S | L |
--------------------
1. T | T | F | T | F |
2. T | T | T | F | F |
etc. 

How would you parameterize or define conditional probability in pymc3, if you wanted to learn CPD/CPT from data?

How would you parameterize the dependency of Grade on both Difficulty and Intelligence?
It could be done with logistic regression, where the parameter of the Grade distribution is defined as theta_grade = alpha + beta_1*difficulty + beta_2*intelligence + error.
But I was wondering if there’s another way… I mean, if you wanted to do MLE, how would you define the graph in pymc3?


I’m not sure I understand: if you have data like that, all your nodes are observed, and you won’t have anything to estimate, as you have no unknowns.

In general, to have a dependency, you make sure the parameter of Grade is some function (e.g., a linear function like what you have above) of the outputs of Difficulty and Intelligence.
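For instance, here is a minimal sketch of that logistic-regression style parameterization in pymc3 (the data arrays are hypothetical stand-ins for the observed True/False columns):

import numpy as np
import pymc3 as pm

# hypothetical 0/1 observations for Difficulty, Intelligence and Grade
difficulty = np.array([1, 1, 0, 1, 0])
intelligence = np.array([1, 1, 1, 0, 0])
grade = np.array([0, 1, 1, 0, 1])

with pm.Model() as grade_model:
    alpha = pm.Normal('alpha', 0., 10.)
    beta = pm.Normal('beta', 0., 10., shape=2)
    # the parameter of Grade is a function of its parents
    theta_grade = pm.math.sigmoid(alpha + beta[0] * difficulty + beta[1] * intelligence)
    pm.Bernoulli('grade', theta_grade, observed=grade)
    trace = pm.sample(1000, tune=1000)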

See also: Intercausal Reasoning in Bayesian Networks

I mean, I have all of them observed, but I’d like to estimate the conditional probability table (or function).

Really what you want to do is estimate the normalization factor (since the integrand for the conditional distribution is just the joint density), so that

P(U|V=v) = P(U, V=v)/N(v)

N(v) is of course just the integral of P(U, V=v) over U. So you could get this as:

import numpy as np
import pymc3 as pm

# the joint model over x, y and z
with pm.Model() as joint:
    x = pm.Normal('x', 0., 1.)
    y = pm.Normal('y', 0., 1.)
    z_l = pm.Deterministic('zl', x + y)
    z = pm.Normal('z', z_l, 1.)

# sample from the conditional (z = x + y, with x pinned at the conditioning value)
with pm.Model() as conditional:
    x = pm.Constant('x', 1.)
    y = pm.Normal('y', 0., 1.)
    z_l = pm.Deterministic('zl', x + y)
    z = pm.Normal('z', z_l, 1.)

    cond_samples = pm.sample(1000)

# evaluate the log-probability of the conditional samples under the joint
N = 100  # number of samples used for the Monte Carlo estimate
joint_lp = np.array([joint.logp({'x': cond_samples['x'][i],
                                 'y': cond_samples['y'][i],
                                 'z': cond_samples['z'][i]})
                     for i in range(N)])

# evaluate the log-probability under the conditional (the proposal density)
cond_lp = np.array([conditional.logp({'x': cond_samples['x'][i],
                                      'y': cond_samples['y'][i],
                                      'z': cond_samples['z'][i]})
                    for i in range(N)])

# Monte Carlo integration: N(v) ~ 1/N sum f(x_i)/p(x_i)
Vlg = np.log(np.sum(np.exp(joint_lp) / np.exp(cond_lp)) / N)
# define the (approximate) conditional log-density
p_cond = lambda d: joint.logp(d) - Vlg

I found the .logp evaluations to be terribly slow, for some reason.

You need to do logp = joint.logp once and then call logp; every access of the .logp property compiles the function again. See the explanation in cell [8] of https://docs.pymc.io/notebooks/api_quickstart.html
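In other words, something along these lines (reusing the joint model and cond_samples from the snippet above):

logp = joint.logp  # compile the log-probability function once
joint_lp = np.array([logp({'x': cond_samples['x'][i],
                           'y': cond_samples['y'][i],
                           'z': cond_samples['z'][i]})
                     for i in range(N)])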


I’m not sure if we understood each other… I was wondering how to estimate conditional probabilities in pymc3…

And the most interesting case for me is when a node has two or more parents… MLE or Bayesian estimation or something like that.

full slides:

I would interpret it as P(X1) meaning P(X1=1), so P(X1=1) = sum(X1==1)/len(X1). Similarly, you have P(X3=1 | X1=1, X2=1) = sum(X1==1 & X2==1 & X3==1)/sum(X1==1 & X2==1).
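As a quick numpy sketch of those proportions (the arrays here are hypothetical boolean observations):

import numpy as np

# hypothetical boolean observations for the three nodes
X1 = np.array([1, 1, 0, 1, 1], dtype=bool)
X2 = np.array([1, 0, 1, 1, 1], dtype=bool)
X3 = np.array([0, 1, 1, 1, 0], dtype=bool)

p_x1 = X1.sum() / len(X1)                      # P(X1=1)
p_x3 = (X1 & X2 & X3).sum() / (X1 & X2).sum()  # P(X3=1 | X1=1, X2=1)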

Those are not samples generated from a Bayesian network, meaning you can’t look at them as a joint probability.

If you have P(X3=1 | X1=1, X2=1) = sum(X1==1 & X2==1 & X3==1)/sum(X1==1 & X2==1),
you already know how to generate a sample of X3 when you know the values of X1 and X2.

In the situation here, we only have the known graph and the data, and we want to learn how X3 depends on X1 and X2.

In this case, since the values are binary and you have no other information, you do know how to generate samples by using a Bernoulli distribution, which means you can estimate the dependence between X3 and (X1, X2) by computing the proportions (that would be the MLE).

And for the Bayesian version, you need to model the dependence of the rate explicitly, even with a simple uninformative prior, for example (here X1, X2 and X3_obs stand for the observed 0/1 data arrays):

import pymc3 as pm

# X1 and X2 are the observed 0/1 parent arrays; X3_obs holds the observed values of X3
with pm.Model() as model:
    rp = pm.Beta('r3_prior', 1., 1., shape=(4,))
    # one rate per configuration of the parents (X1, X2)
    rate3 = pm.Deterministic('rate3', rp[0] * X1 * X2
                                      + rp[1] * X1 * (1 - X2)
                                      + rp[2] * (1 - X1) * X2
                                      + rp[3] * (1 - X1) * (1 - X2))
    X3 = pm.Bernoulli('X3', rate3, observed=X3_obs)

and the posterior distribution of rp is, I think, what you want – giving both mean and uncertainty about the rate of X3 under the various states of X1 and X2.
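To actually obtain that posterior, a minimal usage sketch (continuing from the model above, with X3 given the observed data):

with model:
    trace = pm.sample(1000, tune=1000)

# posterior mean and uncertainty for the four conditional rates
print(pm.summary(trace))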

(My guess is that tt.ifelse may be more efficient than these multiplications)
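As an alternative along those lines (though using integer indexing rather than tt.ifelse), here is a sketch that picks each rate by indexing rp with the parent configuration instead of multiplying indicator terms; the data arrays are hypothetical, and whether it is actually faster would need profiling:

import numpy as np
import pymc3 as pm

# hypothetical 0/1 data for the two parents and the child
X1 = np.array([1, 1, 0, 0, 1, 0])
X2 = np.array([1, 0, 1, 0, 1, 0])
X3_obs = np.array([1, 0, 1, 0, 1, 1])

with pm.Model() as model_idx:
    rp = pm.Beta('r3_prior', 1., 1., shape=(4,))
    # 3 - (2*X1 + X2) maps (1,1)->0, (1,0)->1, (0,1)->2, (0,0)->3,
    # matching the ordering of rp used above
    rate3 = pm.Deterministic('rate3', rp[3 - (2 * X1 + X2)])
    pm.Bernoulli('X3', rate3, observed=X3_obs)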
