 # PyMC3 Conditioning Random Variable on Multiple Discrete Parents

#1

I recently started doing probabilistic programming using PyMC3. In my scenario, I have 3 random variables: On, Triangle, and X, such that X depends on Triangle and On. Triangle and On both follow Bernoulli distributions, and depending on which value they take, the value of X, which follows a Normal, changes.

I wrote up some mock code to test this concept but it doesn’t work. I just started working in this framework, and I know there are potentially many things wrong with this code, but I’m posting this here so you can see what I’ve done.

``````with pymc3.Model() as model:
on = pymc3.distributions.discrete.Bernoulli('on', p=.7)
x_on = pymc3.distributions.continuous.Normal('x_on', 10, .4)

pTriangle_given_on = 1
pTriangle_given_not_on = .7
pTriangle = pymc3.math.switch(on, pTriangle_given_on, pTriangle_given_not_on)
triangle = pymc3.distributions.discrete.Bernoulli('triangle', p=pTriangle)

name = None
x_triangle = None

if triangle:
name = pymc3.distributions.discrete.Categorical('name', p=[.3, .2, .1, .1, .2, .1])
else:
name = pymc3.distributions.discrete.Categorical('name', p=[.1, .5, .4])

if on:
x_triangle = pymc3.Deterministic('x_triangle', x_on)
elif triangle:
x_triangle = pymc3.Normal('x_triangle', 5, 1)
else:
x_triangle = numpy.nan

trace = pymc3.sample()
pymc3.plot_posterior(trace)
``````

I’m not sure how to specify the conditional dependence of X on Triangle and On. Any thoughts from you all would be much appreciated.

Condition Categorical Variable on Bernoulli Parents
#2

As a caveat: I’m not sure what the “proper” way to handle NA-valued outputs is, but below is one way.

Hello and welcome. The core reason the above code doesn’t work is that what you want the if statements to mean is "If the state of `on` is `1`"; but what they’re really asking is "If the variable `on` is not `None`".

Ultimately, the value of a random variable is not available until the sampler has started running, so you can’t directly use `if` on them. The `switch` statement you have is the appropriate one to use; and you could re-write all of your code to use `switch` and it would work just fine.

However, when dealing with Bernoulli distributions, it is often more efficient to exploit the fact that multiplication works effectively as a switch. For instance, the (incorrect) code

``````if on:
return 5
else:
return 2
``````

could be implemented correctly as

`2 + on * 3`

With that in mind, the joint distribution could be defined with:

``````import pymc3 as pm

p_on = 0.3
p_tri_giv_on = 1.
p_tri_giv_not_on = 0.7
tri_delta_on = p_tri_giv_on - p_tri_giv_not_on

mean_on, sd_on = 10., 0.4
mean_tri, sd_tri = 5., 1.
NA_ENCODING = -10.
with pm.Model() as model:
on = pm.Bernoulli('on', p_on)
triangle = pm.Bernoulli('triangle', p_tri_giv_not_on + on * tri_delta_on)
x_lat = pm.Normal('x_l', 0., 1.)  # will transform with `on` and `triangle`
x = pm.Deterministic('x', NA_ENCODING + \
on * (-NA_ENCODING + x_lat * sd_on + mean_on) + \
(1 - on)*triangle*(-NA_ENCODING + x_lat * sd_tri + mean_tri))
res = pm.sample_prior_predictive(1000)

import seaborn as sbn
import pandas as pd
from matplotlib import pyplot as plt
rdf = pd.DataFrame(res)
rdf['sample_num'] = rdf.index

sbn.pointplot(x='sample_num', y='x', data=rdf, join=False)
`````` 1 Like
#3

Thank you @chartl !

#4

Quick question @chartl Why is the quoted value 10, rather than 20? Because I see -(-10) + 0 + 10 = 20.

#5

The first term of x is (positive) `NA_encoding`, which is the default value (i.e. for `on=0` and `triangle=0`). When on=1 then you get

`NA_encoding + (-NA_ENCODING + x_lat*sd_on + mean_on) = x_lat*sd_on + mean_on`

#6

Ahh ok, yes I see now! Thanks!