# Condition Categorical Variable on Bernoulli Parents

I recently asked a question about how to condition a normally distributed variable on Bernoulli parents. After looking at the code, and after some discussion, I understood what was going on.

In my setting, the dependent variable either follows a particular distribution given its parents, or it should take the NA value (or its equivalent). So, in this case, suppose that I have two Bernoulli variables, On and Triangle, and one Categorical variable, Name, that depends on both of them. If On is true, Name follows one categorical distribution; if On is false but Triangle is true, Name follows a different categorical distribution; and if both On and Triangle are false, Name should essentially remain undefined.

The following is mock code:

```python
import numpy
import pymc3

tri_names_given_on = numpy.array([.1, .5, .2, .2])
tri_names_given_triangle = numpy.array([.1, .1, .1, .1, .1, .3, .3])

block_names_given_on = numpy.array([.3, .4, .3])
block_names_given_block = numpy.array([.1, .2, .3, .4])

NA_ENCODING = -10.

with pymc3.Model() as model:
    on = pymc3.Bernoulli('on', pOn)
    triangle = pymc3.Bernoulli('triangle', pTri_given_not_on + on * tri_delta_on)
    block = pymc3.Bernoulli('block', pBlock_given_not_on + on * block_delta_on)

    arg1 = NA_ENCODING + on * (-NA_ENCODING + tri_names_given_on) \
        + (1 - on) * triangle * (-NA_ENCODING + tri_names_given_triangle)
    arg2 = NA_ENCODING + on * (-NA_ENCODING + block_names_given_on) \
        + (1 - on) * block * (-NA_ENCODING + block_names_given_block)

    # Intended branching (pseudocode: arg1/arg2 are symbolic tensors,
    # so these comparisons cannot actually be evaluated at model-build time)
    if arg1 != NA_ENCODING:
        triangle_name = pymc3.Categorical('triangle_name', arg1)
    else:
        triangle_name = pymc3.Deterministic('triangle_name', NA_ENCODING)

    if arg2 != NA_ENCODING:
        block_name = pymc3.Categorical('block_name', arg2)
    else:
        block_name = pymc3.Deterministic('block_name', NA_ENCODING)
```

I’ve been having difficulty specifying my model in PyMC, so I think I’m missing some fundamental knowledge. I’ve looked at some of the tutorials, but right now, they don’t seem comprehensive. I seem to have a lot of gaps in my ability. So, any help is greatly appreciated.

You should have a look at marginalized mixture models - whenever you have discrete variables in your model, you should first try to find a way to marginalize them out.
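To make the marginalization idea concrete, here is a plain NumPy sketch (all probabilities below are made up for illustration, not taken from the model above): summing the discrete parents On and Triangle out collapses the whole hierarchy over Name into a single categorical distribution, with an explicit extra category standing in for NA.

```python
import numpy as np

# Illustrative parent probabilities (hypothetical values)
p_on = 0.6
p_tri_given_not_on = 0.5

# Mixture weights over the three mutually exclusive cases:
# (on), (not on, triangle), (not on, not triangle)
w = np.array([
    p_on,
    (1 - p_on) * p_tri_given_not_on,
    (1 - p_on) * (1 - p_tri_given_not_on),
])

# Component distributions over 3 names plus a 4th explicit "NA" category
p_name_given_on = np.array([.4, .4, .2, 0.])
p_name_given_tri = np.array([.2, .3, .5, 0.])
p_na = np.array([0., 0., 0., 1.])  # all mass on the NA category

# Marginalizing the discrete parents gives one categorical over Name
p_name = w[0] * p_name_given_on + w[1] * p_name_given_tri + w[2] * p_na
```

The sampler then only ever sees the continuous weights, which is exactly what makes NUTS applicable; the NA case becomes just another category rather than a separate code path.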

The Frequently Asked Questions post is a good place to start.

Ok, I will look into this! Thank you


@junpenglao, I came up with this. Please let me know what you think.

```python
tri_name_giv_tri_on_dist = numpy.array([.4, .4, .2])
tri_name_giv_tri_not_on_dist = numpy.array([.2, .3, .5])

block_name_giv_block_on_dist = numpy.array([.3, .3, .4])
block_name_giv_block_not_on_dist = numpy.array([.1, .3, .6])

NA_ENCODING = -10.

with pymc3.Model() as model:

    # Internal NA random variable
    NA = pymc3.Deterministic('NA', NA_ENCODING)

    on = pymc3.Bernoulli('on', pOn)
    triangle = pymc3.Bernoulli('triangle', pTri_given_not_on + on * tri_delta_on)
    block = pymc3.Bernoulli('block', pBlock_given_not_on + on * block_delta_on)

    # Note: .dist() takes only the distribution parameters, no name
    triangle_mixture_weights = numpy.array([on * triangle, (1 - on) * triangle, (1 - on) * (1 - triangle)])
    tri_name_given_tri_and_on = pymc3.Categorical.dist(tri_name_giv_tri_on_dist)
    tri_name_given_tri_and_not_on = pymc3.Categorical.dist(tri_name_giv_tri_not_on_dist)
    triangle_name = pymc3.Mixture('triangle_name', w=triangle_mixture_weights,
                                  comp_dists=[tri_name_given_tri_and_on, tri_name_given_tri_and_not_on, NA])

    block_mixture_weights = numpy.array([on * block, (1 - on) * block, (1 - on) * (1 - block)])
    block_name_given_block_and_on = pymc3.Categorical.dist(block_name_giv_block_on_dist)
    block_name_given_block_and_not_on = pymc3.Categorical.dist(block_name_giv_block_not_on_dist)
    block_name = pymc3.Mixture('block_name', w=block_mixture_weights,
                               comp_dists=[block_name_given_block_and_on, block_name_given_block_and_not_on, NA])
```

You are getting there! You should rewrite the following into continuous variables - either use the parameters (i.e., `pOn` and `pTri_given_not_on + on * tri_delta_on` here) directly, or wrap them in a Beta distribution:

```python
on = pymc3.Bernoulli('on', pOn)
triangle = pymc3.Bernoulli('triangle', pTri_given_not_on + on * tri_delta_on)
block = pymc3.Bernoulli('block', pBlock_given_not_on + on * block_delta_on)
```

Into:

```python
on = pOn
triangle = pTri_given_not_on + on * tri_delta_on
block = pBlock_given_not_on + on * block_delta_on
```

@junpenglao, So, you’re saying I should do something like this?

```python
# Internal x, y, z position variable to be transformed
pos = pymc3.Normal('pos', 0., 1., shape=3)

# Internal NA random variable
NA = pymc3.Deterministic('NA', NA_ENCODING)

# On (note: the Beta needs its own name to avoid clashing with 'on' below)
pOn = pymc3.Beta('pOn', alpha=on_count, beta=schema_count - on_count)
on = pymc3.Bernoulli('on', pOn)

# Triangle
triangle_mixture_weights = np.array([on, (1 - on)])
tri_giv_on = pymc3.Bernoulli.dist(pTri_given_not_on + tri_delta_on)
tri_giv_not_on = pymc3.Bernoulli.dist(pTri_given_not_on)
triangle = pymc3.Mixture('triangle', w=triangle_mixture_weights,
                         comp_dists=[tri_giv_on, tri_giv_not_on])

triangle_name_mixture_weights = np.array([on * triangle, (1 - on) * triangle, (1 - on) * (1 - triangle)])
tri_name_given_tri_and_on = pymc3.Categorical.dist(tri_name_giv_tri_on_dist)
tri_name_given_tri_and_not_on = pymc3.Categorical.dist(tri_name_giv_tri_not_on_dist)
triangle_name = pymc3.Mixture('triangle_name', w=triangle_name_mixture_weights,
                              comp_dists=[tri_name_given_tri_and_on, tri_name_given_tri_and_not_on, NA])

x1 = pymc3.Deterministic('x1', NA_ENCODING + on * (-NA_ENCODING + pos[0] * std_x1_on + mu_x1_on) +
                         (1 - on) * triangle * (-NA_ENCODING + pos[0] * std_x1_tri + mu_x_tri))
y1 = pymc3.Deterministic('y1', NA_ENCODING + on * (-NA_ENCODING + pos[1] * std_y1_on + mu_y1_on) +
                         (1 - on) * triangle * (-NA_ENCODING + pos[1] * std_y1_tri + mu_y_tri))
z1 = pymc3.Deterministic('z1', NA_ENCODING + on * (-NA_ENCODING + pos[2] * std_z1_on + mu_z1_on) +
                         (1 - on) * triangle * (-NA_ENCODING + pos[2] * std_z1_tri + mu_z_tri))
```

And similarly for the block?

Nope - I mean try to avoid something like `on = pymc3.Bernoulli('on', pOn)` as it is an unobserved discrete variable.

Ah, Ok, I see… You are suggesting that I marginalize it out, right?

yep

So, I’m not sure what to do about marginalization. In my setting, I will want to infer the values of these hidden variables. If they’re marginalized out like that, I won’t be able to do this, right?

You can infer the continuous mixture weight instead - if you want an explicit latent label, you can sample from the mixture weight with a categorical, or apply argmax.
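As a rough NumPy sketch of those two options (the weight values are illustrative, e.g. a posterior mean of a Dirichlet weight vector, or per-draw weights from a trace):

```python
import numpy as np

# Hypothetical posterior mixture weights over three latent states
posterior_w = np.array([0.12, 0.71, 0.17])

# Option 1: hard label via argmax
label = int(np.argmax(posterior_w))

# Option 2: draw an explicit latent label from the weights
rng = np.random.default_rng(0)
sampled_label = rng.choice(len(posterior_w), p=posterior_w)
```

Either way, the discrete label is recovered after sampling, from the continuous posterior over the weights, rather than being sampled as a discrete node inside the model.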

If you have some resources you could point me to, I’d appreciate that. I’m not sure I follow 100% what you’re suggesting that I do.

Also, in my setting, it is possible to mark some of the hidden variables as observed. For example, someone might say that `on` is true, while being interested in the values for `triangle` and `block`.

You have seen the answer in FAQ right? You can also have a look at this notebook: https://github.com/junpenglao/advance-bayesian-modelling-with-PyMC3/blob/master/Notebooks/Code10%20-%20Schizophrenic_case_study.ipynb

Hope it gives you some inspirations!

For the most part I understood your example, but I couldn’t grasp what was happening when you wrote:

```python
Z_latent = pm.Uniform('Z_latent', 0., 1., shape=(6, Nt))
Z = pm.Deterministic('Z',
                     pm.theanof.tt_rng().binomial(
                         n=1, p=Z_latent, size=(6, Nt)))
```

My guess is that you’re marginalizing `Z`, but it’s not clear to me. `Z_latent` will always be `1`, and you’re feeding that into some theano binomial distribution parameterized by one trial and probability of success, `p=Z_latent=1`?

I looked up `pm.theanof.tt_rng()`, but wasn't able to find clear information on it.

Yeah, that example is a bit convoluted - it is more to show what is possible, but I would not recommend it. The other two ways of modeling it are probably what you should focus on.

Hey, I think I've made a lot of progress on my problem. Do you have any examples you can point me to about inferring the mixture weights?

Do you mean doing something like this:

```python
# Mixture weights
pOns = pymc3.Dirichlet('pOns', numpy.array(on_alphas))

on = pymc3.Categorical('on', pOns)
```

But for the mixture model, I would parameterize it using `pOns`, rather than `on`? Like:

```python
triangle = pymc3.Mixture('triangle', w=pOns,
                         comp_dists=[tri_giv_not_on, tri_giv_on],
                         testval=1, dtype="int64", observed=1)
```

Thanks!

Yes! That’s in general the idea of marginalization.

As for doing inference on mixtures, there are a few posts on the discourse you can look at. You can start with the discussion here: Properly sampling mixture models