Condition Categorical Variable on Bernoulli Parents

I recently asked a question about how to condition a normally distributed variable on Bernoulli parents. After looking at the code, and after some discussion, I understood what was going on.

In my setting, the dependent variable either follows a particular distribution given its parents, or it should take an NA value (or its equivalent). So, in this case, suppose that I have two Bernoulli variables, On and Triangle, and one Categorical variable, Name, that depends on both of them. If On is true, then Name follows a categorical distribution determined by On. If Triangle is true, Name follows another categorical distribution determined by Triangle. In the case where both On and Triangle are false, Name should essentially remain undefined.

The following is mock code:

import numpy
import pymc3

tri_names_given_on = numpy.array([.1, .5, .2, .2])
tri_names_given_triangle = numpy.array([.1, .1, .1, .1, .1, .3, .3])

block_names_given_on = numpy.array([.3, .4, .3])
block_names_given_block = numpy.array([.1, .2, .3, .4])

NA_ENCODING = -10.

with pymc3.Model() as model:

   # pOn, pTri_given_not_on, tri_delta_on, pBlock_given_not_on, and
   # block_delta_on stand in for known constants
   on = pymc3.Bernoulli('on', pOn)
   triangle = pymc3.Bernoulli('triangle', pTri_given_not_on + on * tri_delta_on)
   block = pymc3.Bernoulli('block', pBlock_given_not_on + on * block_delta_on)
   arg1 = NA_ENCODING + on * (-NA_ENCODING + tri_names_given_on) + (1 - on) * triangle * (-NA_ENCODING + tri_names_given_triangle)
   arg2 = NA_ENCODING + on * (-NA_ENCODING + block_names_given_on) + (1 - on) * block * (-NA_ENCODING + block_names_given_block)

   triangle_name = None
   block_name = None

   # Intended behavior only: I realize a Python-level if cannot branch on a
   # symbolic value like this, but it expresses what I want to happen
   if arg1 != NA_ENCODING:
      triangle_name = pymc3.Categorical('triangle_name', arg1)
   else:
      triangle_name = pymc3.Deterministic('triangle_name', NA_ENCODING)

   if arg2 != NA_ENCODING:
      block_name = pymc3.Categorical('block_name', arg2)
   else:
      block_name = pymc3.Deterministic('block_name', NA_ENCODING)

I’ve been having difficulty specifying my model in PyMC3, so I think I’m missing some fundamental knowledge. I’ve looked at some of the tutorials, but they don’t seem comprehensive enough to fill the gaps in my understanding. So, any help is greatly appreciated.

You should have a look at marginalized mixture models - whenever you have discrete variables in your model, you should first try to find a way to marginalize them out :slight_smile:

The Frequently Asked Questions page is a good place to start.
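
For instance, here is a minimal sketch of marginalizing a single Bernoulli parent out of a categorical child - every name and probability table below is made up, it is only meant to show the pattern:

import numpy
import pymc3

p_given_on = numpy.array([.1, .5, .2, .2])        # made-up P(name | on)
p_given_off = numpy.array([.25, .25, .25, .25])   # made-up P(name | not on)

with pymc3.Model() as marginal_model:
   # continuous weight instead of a discrete Bernoulli draw
   p_on = pymc3.Beta('p_on', alpha=1., beta=1.)
   # P(name) = P(on) * P(name | on) + P(not on) * P(name | not on)
   name_p = p_on * p_given_on + (1. - p_on) * p_given_off
   name = pymc3.Categorical('name', p=name_p, observed=numpy.array([0, 1, 1, 3]))

NUTS can then sample p_on efficiently, and you recover P(on) from its posterior instead of from discrete draws.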

Ok, I will look into this! Thank you


@junpenglao, I came up with this. Please let me know what you think.

tri_name_giv_tri_on_dist = numpy.array([.4, .4, .2])
tri_name_giv_tri_not_on_dist = numpy.array([.2, .3, .5])

block_name_giv_block_on_dist = numpy.array([.3, .3, .4])
block_name_giv_block_not_on_dist = numpy.array([.1, .3, .6])

NA_ENCODING = -10.

with pymc3.Model() as model:

   # Internal NA placeholder (a deterministic constant)
   NA = pymc3.Deterministic('NA', NA_ENCODING)
   
   on = pymc3.Bernoulli('on', pOn)
   triangle = pymc3.Bernoulli('triangle', pTri_given_not_on + on * tri_delta_on)
   block = pymc3.Bernoulli('block', pBlock_given_not_on + on * block_delta_on)

   triangle_mixture_weights = numpy.array([on * triangle, (1 - on) * triangle, (1 - on) * (1 - triangle)])
   tri_name_given_tri_and_on = pymc3.Categorical.dist(tri_name_giv_tri_on_dist)
   tri_name_given_tri_and_not_on = pymc3.Categorical.dist(tri_name_giv_tri_not_on_dist)
   triangle_name = pymc3.Mixture('triangle_name', w=triangle_mixture_weights, comp_dists=[tri_name_given_tri_and_on, tri_name_given_tri_and_not_on, NA])

   block_mixture_weights = numpy.array([on * block, (1 - on) * block, (1 - on) * (1 - block)])
   block_name_given_block_and_on = pymc3.Categorical.dist(block_name_giv_block_on_dist)
   block_name_given_block_and_not_on = pymc3.Categorical.dist(block_name_giv_block_not_on_dist)
   block_name = pymc3.Mixture('block_name', w=block_mixture_weights, comp_dists=[block_name_given_block_and_on, block_name_given_block_and_not_on, NA])

You are getting there! You should rewrite the following to use continuous variables - either use the parameters (i.e., pOn and pTri_given_not_on + on * tri_delta_on here) directly, or wrap them in a Beta distribution:

   on = pymc3.Bernoulli('on', pOn)
   triangle = pymc3.Bernoulli('triangle', pTri_given_not_on + on * tri_delta_on)
   block = pymc3.Bernoulli('block', pBlock_given_not_on + on * block_delta_on)

Into:

   on = pOn
   triangle = pTri_given_not_on + on * tri_delta_on
   block = pBlock_given_not_on + on * block_delta_on
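
If you want on itself to be learned rather than fixed, a rough sketch of the Beta-wrapped version looks like this (the hyperparameters and constants are placeholders, not values from your problem):

import pymc3

# placeholder constants standing in for your known quantities
pTri_given_not_on, tri_delta_on = .2, .3
pBlock_given_not_on, block_delta_on = .3, .2

with pymc3.Model() as model:
   # "on" is now a continuous probability with a Beta prior,
   # not an unobserved discrete draw
   on = pymc3.Beta('on', alpha=1., beta=1.)
   triangle = pymc3.Deterministic('triangle', pTri_given_not_on + on * tri_delta_on)
   block = pymc3.Deterministic('block', pBlock_given_not_on + on * block_delta_on)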

@junpenglao, So, you’re saying I should do something like this?

   # Internal x,y,z position variable to be transformed
   pos = pymc3.Normal('pos', 0., 1., shape=3)

   # Internal NA placeholder (a deterministic constant)
   NA = pymc3.Deterministic('NA', NA_ENCODING)
   
   # On 
   pOn = pymc3.Beta('pOn', alpha=on_count, beta=schema_count - on_count)
   on = pymc3.Bernoulli('on', pOn)

   # Triangle
   triangle_mixture_weights = np.array([on, (1 - on)])
   tri_giv_on = pymc3.Bernoulli.dist(pTri_given_not_on + tri_delta_on)
   tri_giv_not_on = pymc3.Bernoulli.dist(pTri_given_not_on)
   triangle = pymc3.Mixture('triangle', w=triangle_mixture_weights, comp_dists=[tri_giv_on, tri_giv_not_on])
   
   triangle_name_mixture_weights = np.array([on * triangle, (1 - on) * triangle, (1 - on) * (1 - triangle)])   
   tri_name_given_tri_and_on = pymc3.Categorical.dist(tri_name_giv_tri_on_dist)
   tri_name_given_tri_and_not_on = pymc3.Categorical.dist(tri_name_giv_tri_not_on_dist)
   triangle_name = pymc3.Mixture('triangle_name', w=triangle_name_mixture_weights, comp_dists=[tri_name_given_tri_and_on, tri_name_given_tri_and_not_on, NA])

   x1 = pymc3.Deterministic('x1', NA_ENCODING + on * (-NA_ENCODING + pos[0] * std_x1_on + mu_x1_on) + \
                            (1 - on) * triangle * (-NA_ENCODING + pos[0] * std_x1_tri + mu_x1_tri))
   y1 = pymc3.Deterministic('y1', NA_ENCODING + on * (-NA_ENCODING + pos[1] * std_y1_on + mu_y1_on) + \
                            (1 - on) * triangle * (-NA_ENCODING + pos[1] * std_y1_tri + mu_y1_tri))
   z1 = pymc3.Deterministic('z1', NA_ENCODING + on * (-NA_ENCODING + pos[2] * std_z1_on + mu_z1_on) + \
                            (1 - on) * triangle * (-NA_ENCODING + pos[2] * std_z1_tri + mu_z1_tri))

And similarly for the block?

Nope - I mean try to avoid something like on = pymc3.Bernoulli('on', pOn) as it is an unobserved discrete variable.

Ah, Ok, I see… You are suggesting that I marginalize it out, right?

yep :slight_smile:

So, I’m not sure what to do about marginalization. In my setting, I will want to infer the values of these hidden variables. If they’re marginalized out like that, I won’t be able to do this, right?

You can infer the continuous mixture weights instead - if you want an explicit latent label, you can sample from the mixture weights with a categorical, or apply argmax.
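
Something along these lines - a toy sketch where the data, the two components, and the priors are all made up, just to show how to recover a label after marginalizing:

import numpy
import pymc3

# fake data from two components
data = numpy.concatenate([numpy.random.normal(-2., 1., 100),
                          numpy.random.normal(3., 1., 100)])

with pymc3.Model() as mix:
   w = pymc3.Dirichlet('w', a=numpy.ones(2))   # continuous mixture weights
   mu = pymc3.Normal('mu', 0., 5., shape=2)
   obs = pymc3.NormalMixture('obs', w=w, mu=mu, sigma=1., observed=data)
   trace = pymc3.sample(1000, tune=1000)

# hard label: argmax over the posterior mean weights
print(numpy.argmax(trace['w'].mean(axis=0)))

# or draw explicit labels from the posterior weights with a categorical
labels = numpy.array([numpy.random.choice(2, p=w_draw) for w_draw in trace['w']])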

If you have some resources you could point me to, I’d appreciate that. I’m not sure I follow 100% what you’re suggesting that I do.

Also, in my setting, it is possible to mark some of the hidden variables as observed. For example, someone might say that on is true, while being interested in the values for triangle and block.

You have seen the answer in FAQ right? You can also have a look at this notebook: https://github.com/junpenglao/advance-bayesian-modelling-with-PyMC3/blob/master/Notebooks/Code10%20-%20Schizophrenic_case_study.ipynb
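
On your point about marking some variables as observed: that part is fine - an observed discrete variable causes no trouble, it is only the unobserved ones you need to marginalize. A toy sketch (the Beta prior is just a placeholder):

import pymc3

with pymc3.Model() as m:
   p_on = pymc3.Beta('p_on', alpha=1., beta=1.)
   on = pymc3.Bernoulli('on', p=p_on, observed=1)   # "on" is told to be true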

Hope it gives you some inspirations!

Thanks for the link!

For the most part I understood your example, but I couldn’t grasp what was happening when you wrote:

Z_latent = pm.Uniform('Z_latent', 0., 1., shape=(6, Nt))
Z = pm.Deterministic('Z',
                         pm.theanof.tt_rng().binomial(
                             n=1, p=Z_latent, size=(6, Nt)))

My guess is that you’re marginalizing Z, but it’s not clear to me. Z_latent will always be 1, and you’re feeding that into some theano binomial distribution parameterized by one trial and probability of success, p=Z_latent=1?

I looked up pm.theanof.tt_rng(), but wasn’t able to find clear information on it.

Yeah, that example is a bit convoluted :sweat_smile: - it is more to show what is possible, but I would not recommend it - the other two ways of modeling it are probably what you should focus on.

Hey, I think I’ve made a lot of progress on my problem. Do you have any examples you can point me to about inferring the mixture weights?

Do you mean doing something like this:

# Mixture weights
pOns = pymc3.Dirichlet('pOns', numpy.array(on_alphas))

on = pymc3.Categorical('on', pOns)

But for the mixture model, I would parameterize it using pOns, rather than on? Like:

triangle = pymc3.Mixture('triangle', w=pOns, \
                            comp_dists=[tri_giv_not_on, tri_giv_on], \
                            testval=1, dtype="int64", observed=1)

Thanks!

Yes! That’s in general the idea of marginalization.

As for doing inference on mixtures, there are a few posts on the Discourse you can look at. You can start with the discussion here: Properly sampling mixture models
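
For example - assuming your snippet above is wrapped in a pymc3.Model() context called model - you would then inspect the posterior of pOns directly instead of relying on discrete draws of on:

with model:
   trace = pymc3.sample(1000, tune=1000)

# posterior over the marginalized weights, i.e. P(on) and P(not on)
print(trace['pOns'].mean(axis=0))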