Parameter Learning in Simple Bayesian Network

dreeves · March 24, 2018, 3:18pm

Hello,

Thank you very much for creating this library and sharing it with the community. I am attempting to use PyMC3 to learn the parameters of a simple Bayesian network. The example is taken from this publication:

The problem is a coin-flipping experiment. A fair coin is flipped to determine which of two possibly biased coins (A or B) is then flipped 10 times. The outcome of each of the 10 flips is observed, but which coin (A or B) was chosen is unknown. The experiment is repeated 5 times, and the authors use EM to estimate the biases of coin A and coin B from the 50 observations (10 at a time).
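For reference, the EM procedure described above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the publication's code: the function name `em_two_coins` is hypothetical, the head counts are the ones used later in this post, and the starting values 0.6 and 0.5 are an assumption chosen to match the means of the Beta priors used below.

```python
def em_two_coins(heads, n_flips=10, theta_a=0.6, theta_b=0.5, n_iter=50):
    """EM for two coins with unknown identity per round (equal prior on A and B)."""
    for _ in range(n_iter):
        heads_a = flips_a = 0.0
        heads_b = flips_b = 0.0
        for h in heads:
            t = n_flips - h
            # E-step: responsibility of each coin for this round
            # (the binomial coefficient cancels in the ratio).
            like_a = theta_a ** h * (1 - theta_a) ** t
            like_b = theta_b ** h * (1 - theta_b) ** t
            resp_a = like_a / (like_a + like_b)
            resp_b = 1.0 - resp_a
            # Accumulate expected head counts and flip counts per coin.
            heads_a += resp_a * h
            flips_a += resp_a * n_flips
            heads_b += resp_b * h
            flips_b += resp_b * n_flips
        # M-step: re-estimate each bias from the expected counts.
        theta_a = heads_a / flips_a
        theta_b = heads_b / flips_b
    return theta_a, theta_b


theta_a, theta_b = em_two_coins([5, 9, 8, 4, 7])
print(theta_a, theta_b)
```

Unlike the Bayesian model below, this returns point estimates only and depends on the starting values.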

I am attempting to use PyMC3 to do this estimation as an alternative to EM. I tried to piece together a model based on documentation and user questions. Based on an approach that I found in this gist

https://gist.github.com/tbsexton/1349864212b25cce91dbe5e336d794b4

I am able to get the expected results for a single round of 10 coin flips using the code below:

import numpy as np
import theano
import pymc3 as pm    

with pm.Model() as model:
    X0 = pm.Bernoulli('X0', 0.5)
    alpha_ps = np.array([600,500])
    beta_ps = np.array([400,500])
    alphas = theano.shared(alpha_ps)  # make numpy-->theano
    betas = theano.shared(beta_ps)
    alpha_0 = alphas[X0]  # select the alpha value that "happened" thanks to parent (X0)
    beta_0 = betas[X0] # select the beta value that "happened" thanks to parent (X0)
    theta = pm.Beta('theta',alpha=alpha_0, beta=beta_0)
    X1 = pm.Binomial(
            'X1', 
            n=10,
            p=theta, 
            observed=np.array(
                            [5]
            )
    )

    # pm.sample must run inside the model context
    trace = pm.sample(100000, tune=10000)

# Posterior mean of theta and its complement
theta_mean = trace['theta'].mean()
print(theta_mean, 1 - theta_mean)

Where I am running into trouble is getting this to work for all 5 rounds of coin flips. I thought something like the following would work:

with pm.Model() as model:
    X0 = pm.Bernoulli('X0', 0.5, shape=(5,1))
    alpha_ps = np.array([600,500])
    beta_ps = np.array([400,500])
    alphas = theano.shared(alpha_ps)  # make numpy-->theano
    betas = theano.shared(beta_ps)
    alpha_0 = alphas[X0]  # select the alpha value that "happened" thanks to parent (X0)
    beta_0 = betas[X0] # select the beta value that "happened" thanks to parent (X0)
    theta = pm.Beta('theta',alpha=alpha_0, beta=beta_0, shape=(5,1))
    X1 = pm.Binomial(
                'X1', 
                n=10,
                p=theta,
                observed=np.array(
                                [5, 9, 8, 4, 7]
                ),
                shape=(5,1)
    )

    trace = pm.sample(100000, tune=10000)

However, I get the following error:

TypeError: Cannot convert Type TensorType(int64, matrix) (of Variable X0_shared__) into Type TensorType(int64, col). You can try to manually convert X0_shared__ into a TensorType(int64, col).

This seems to be the result of using the theano.shared approach for the dependence of the value of theta on X0 (the coin that was chosen to perform the 10 flip experiment).

Is there a way for me to use my current approach to estimate theta or is there (perhaps) a different approach which will accomplish this goal?

Any help you can provide would be greatly appreciated.

Thanks!

junpenglao · March 24, 2018, 5:15pm

You are on the right track; you just need to be more careful with the shapes:

alpha_ps = np.array([600, 500])
beta_ps = np.array([400, 500])
alphas = theano.shared(alpha_ps)  # make numpy-->theano
betas = theano.shared(beta_ps)
with pm.Model() as model:
    X0 = pm.Bernoulli('X0', 0.5, shape=5)
    alpha_0 = alphas[X0]
    beta_0 = betas[X0]
    theta = pm.Beta('theta', alpha=alpha_0, beta=beta_0, shape=5)
    X1 = pm.Binomial(
        'X1',
        n=10,
        p=theta,
        observed=np.array([5, 9, 8, 4, 7])
    )
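To see why the shapes matter, here is a small sketch in plain NumPy. It does not reproduce the Theano error itself; it only illustrates the indexing and broadcasting behaviour involved, on the assumption that Theano follows the same rules here. The coin choices in `coin_flat` are made up for illustration.

```python
import numpy as np

alpha_ps = np.array([600, 500])
observed = np.array([5, 9, 8, 4, 7])

coin_flat = np.array([0, 1, 1, 0, 1])  # hypothetical coin choices, shape (5,)
coin_col = coin_flat.reshape(5, 1)     # the same choices as a column, shape (5, 1)

# Integer-array indexing returns an array with the shape of the index.
alpha_flat = alpha_ps[coin_flat]       # shape (5,)
alpha_col = alpha_ps[coin_col]         # shape (5, 1)

# Combining with the length-5 observations:
paired = alpha_flat + observed         # elementwise, one value per round
crossed = alpha_col + observed         # broadcasts every round against every observation

print(paired.shape, crossed.shape)
```

With `shape=5`, each `theta` lines up with exactly one observed count; with `shape=(5,1)`, the parameters would be broadcast against all five observations.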

dreeves · March 24, 2018, 8:01pm

That works. Thanks!

dreeves · March 26, 2018, 1:35pm

Is it also possible to include alpha and beta in the sampling process using this setup (or something similar) by assigning uniform priors to each? My attempt to do this with the following code results in an error:

with pm.Model() as model:
    X0 = pm.Bernoulli('X0', 0.5, shape=5)
    alpha_ps = np.array([pm.Uniform('alpha_c0', 1,100), pm.Uniform('alpha_c1', 1,100)])
    beta_ps = np.array([pm.Uniform('beta_c0', 1,100), pm.Uniform('beta_c1', 1,100)])
    alphas = theano.shared(alpha_ps) 
    betas = theano.shared(beta_ps)
    alpha_0 = alphas[X0]
    beta_0 = betas[X0] 
    theta = pm.Beta('theta',alpha=alpha_0, beta=beta_0, shape=5)
    X1 = pm.Binomial(
                'X1', 
                n=10,
                p=theta,
                observed=np.array(
                                [5, 9, 8, 4, 7]
                ),
                shape=5
    )

    trace = pm.sample(100000, tune=10000)

TypeError: The generic 'SharedVariable' object is not subscriptable. This shared variable contains a Numpy array with dtype: 'object'. This data type is not currently recognized by Theano tensors: please cast your data into a supported numeric type if you need Theano tensor functionalities.
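The `dtype: 'object'` part of this message can be illustrated with plain NumPy. In the sketch below, `FakeRandomVariable` is a hypothetical stand-in class, not a real PyMC3 variable; the point is only that wrapping arbitrary Python objects in `np.array` produces an object array rather than a numeric one.

```python
import numpy as np


class FakeRandomVariable:
    """Hypothetical stand-in for a symbolic random variable."""

    def __init__(self, name):
        self.name = name


# Wrapping non-numeric Python objects gives an object-dtype array.
wrapped = np.array([FakeRandomVariable('alpha_c0'), FakeRandomVariable('alpha_c1')])

# Wrapping plain numbers gives a numeric array, as in the earlier models.
numeric = np.array([600, 500])

print(wrapped.dtype, numeric.dtype)
```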

junpenglao · March 26, 2018, 1:53pm

Yes, it is possible by specifying the shape of the priors:

with pm.Model() as model:
    X0 = pm.Bernoulli('X0', 0.5, shape=5)
    alphas = pm.Uniform('alphas', 1, 100, shape=2)
    betas = pm.Uniform('betas', 1, 100, shape=2)
    alpha_0 = alphas[X0]
    beta_0 = betas[X0]
    theta = pm.Beta('theta', alpha=alpha_0, beta=beta_0, shape=5)
    X1 = pm.Binomial(
        'X1',
        n=10,
        p=theta,
        observed=np.array(
            [5, 9, 8, 4, 7]
        )
    )

However, sampling from this model won’t work out of the box, because it suffers from label switching: the two coins are exchangeable, so the posterior is unchanged when their labels are swapped. For more information, see “Identifying Bayesian Mixture Models in PyMC3” (a port of the Stan case study).
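As a rough illustration of label switching, the sketch below computes the log-likelihood of the five rounds with the coin choice marginalized out (each coin picked with probability 0.5) and checks that swapping the two biases leaves it unchanged. The function name `log_likelihood` and the bias values 0.8 and 0.52 are arbitrary choices for this example, not part of the model above.

```python
from math import comb, isclose, log


def log_likelihood(theta_a, theta_b, heads, n_flips=10):
    """Log-likelihood of the head counts with the coin identity summed out."""
    total = 0.0
    for h in heads:
        binom = comb(n_flips, h)
        p_a = binom * theta_a ** h * (1 - theta_a) ** (n_flips - h)
        p_b = binom * theta_b ** h * (1 - theta_b) ** (n_flips - h)
        # Each coin is chosen with probability 0.5 by the fair coin flip.
        total += log(0.5 * p_a + 0.5 * p_b)
    return total


heads = [5, 9, 8, 4, 7]
original = log_likelihood(0.8, 0.52, heads)
swapped = log_likelihood(0.52, 0.8, heads)
print(original, swapped)
```

Because both assignments fit the data equally well and the priors on the two coins are identical, a sampler can move between the two labelings, which makes per-coin posterior summaries hard to interpret.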

dreeves · April 2, 2018, 2:42pm

This is very helpful. Thank you!
