Parameter Learning in a Simple Bayesian Network

Hello,

Thank you very much for creating this library and sharing it with the community. I am attempting to use PyMC3 to learn the parameters of a simple Bayesian network. The example is taken from this publication:

The problem is a coin-flipping experiment. A fair coin is flipped to determine which of two possibly biased coins (A and B) is to be flipped 10 times. The outcome of each of the 10 flips is observed, but which coin (A or B) was chosen is unknown. The experiment is repeated 5 times, and the authors use EM to estimate the biases of coins A and B from the 50 observed flips (5 groups of 10).
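To make the setup concrete, here is a minimal sketch of the generative process (the biases 0.6 and 0.5 are hypothetical, chosen only for illustration):

import numpy as np

rng = np.random.RandomState(0)
bias = {'A': 0.6, 'B': 0.5}  # hypothetical coin biases
for _ in range(5):  # 5 rounds of the experiment
    coin = rng.choice(['A', 'B'])  # fair coin picks A or B (hidden in the real data)
    heads = rng.binomial(n=10, p=bias[coin])  # flip the chosen coin 10 times
    print(coin, heads)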

I am attempting to use PyMC3 to do this estimation as an alternative to EM, piecing together a model from the documentation and other user questions. Based on an approach I found in this gist:

https://gist.github.com/tbsexton/1349864212b25cce91dbe5e336d794b4

I am able to get the expected result for a single round of 10 coin flips using the code below:

import numpy as np
import theano
import pymc3 as pm

with pm.Model() as model:
    X0 = pm.Bernoulli('X0', 0.5)  # which coin (A or B) was chosen
    alpha_ps = np.array([600, 500])
    beta_ps = np.array([400, 500])
    alphas = theano.shared(alpha_ps)  # make numpy --> theano
    betas = theano.shared(beta_ps)
    alpha_0 = alphas[X0]  # select the alpha value that "happened" thanks to parent (X0)
    beta_0 = betas[X0]  # select the beta value that "happened" thanks to parent (X0)
    theta = pm.Beta('theta', alpha=alpha_0, beta=beta_0)
    X1 = pm.Binomial(
        'X1',
        n=10,
        p=theta,
        observed=np.array([5])
    )

    trace = pm.sample(100000, tune=10000)  # must run inside the model context

print(sum(trace['theta']) / len(trace['theta']), 1 - sum(trace['theta']) / len(trace['theta']))
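Equivalently (assuming the trace above), the posterior mean can be read off more directly:

print(trace['theta'].mean(), 1 - trace['theta'].mean())  # posterior mean of theta
print(pm.summary(trace))  # convergence diagnostics and credible intervals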

Where I am running into trouble is getting this to work for all 5 rounds of coin flips. I thought something like the following would work:

with pm.Model() as model:
    X0 = pm.Bernoulli('X0', 0.5, shape=(5,1))
    alpha_ps = np.array([600, 500])
    beta_ps = np.array([400, 500])
    alphas = theano.shared(alpha_ps)  # make numpy --> theano
    betas = theano.shared(beta_ps)
    alpha_0 = alphas[X0]  # select the alpha value that "happened" thanks to parent (X0)
    beta_0 = betas[X0]  # select the beta value that "happened" thanks to parent (X0)
    theta = pm.Beta('theta', alpha=alpha_0, beta=beta_0, shape=(5,1))
    X1 = pm.Binomial(
        'X1',
        n=10,
        p=theta,
        observed=np.array([5, 9, 8, 4, 7]),
        shape=(5,1)
    )

    trace = pm.sample(100000, tune=10000)

However, I get the following error:

TypeError: Cannot convert Type TensorType(int64, matrix) (of Variable X0_shared__) into Type TensorType(int64, col). You can try to manually convert X0_shared__ into a TensorType(int64, col).

This seems to be the result of using the theano.shared approach for the dependence of theta on X0 (the coin chosen for the 10-flip experiment).
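As a sanity check on what I believe is happening, indexing a 1-D shared vector with a matrix-shaped index produces a plain matrix type rather than the column type the sampler expects (a minimal sketch):

import numpy as np
import theano
import theano.tensor as tt

vec = theano.shared(np.array([600, 500]))  # 1-D integer vector
idx = tt.lmatrix('idx')  # an integer matrix index, e.g. shape (5, 1)
print(vec[idx].type)  # TensorType(int64, matrix) -- a matrix, not a col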

Is there a way to make my current approach estimate theta, or is there perhaps a different approach that will accomplish this goal?

Any help you can provide would be greatly appreciated.

Thanks!

You are on the right track; you just need to be more careful with the shapes:

alpha_ps = np.array([600, 500])
beta_ps = np.array([400, 500])
alphas = theano.shared(alpha_ps)  # make numpy --> theano
betas = theano.shared(beta_ps)
with pm.Model() as model:
    X0 = pm.Bernoulli('X0', 0.5, shape=5)  # 1-D shape, not (5, 1)
    alpha_0 = alphas[X0]  # indexing a vector with a vector keeps everything 1-D
    beta_0 = betas[X0]
    theta = pm.Beta('theta', alpha=alpha_0, beta=beta_0, shape=5)
    X1 = pm.Binomial(
        'X1',
        n=10,
        p=theta,
        observed=np.array([5, 9, 8, 4, 7])
    )
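For example (a sketch, assuming the model above), you can then sample and inspect the per-round estimates:

with model:
    trace = pm.sample(2000, tune=1000)

print(trace['theta'].mean(axis=0))  # posterior mean bias for each of the 5 rounds
print(trace['X0'].mean(axis=0))  # posterior probability that each round used coin index 1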


That works. Thanks!

Is it also possible to include alpha and beta in the sampling process using this setup (or something similar) by assigning uniform priors to each? My attempt to do this with the following code results in an error:

with pm.Model() as model:
    X0 = pm.Bernoulli('X0', 0.5, shape=5)
    alpha_ps = np.array([pm.Uniform('alpha_c0', 1, 100), pm.Uniform('alpha_c1', 1, 100)])
    beta_ps = np.array([pm.Uniform('beta_c0', 1, 100), pm.Uniform('beta_c1', 1, 100)])
    alphas = theano.shared(alpha_ps)
    betas = theano.shared(beta_ps)
    alpha_0 = alphas[X0]
    beta_0 = betas[X0]
    theta = pm.Beta('theta', alpha=alpha_0, beta=beta_0, shape=5)
    X1 = pm.Binomial(
        'X1',
        n=10,
        p=theta,
        observed=np.array([5, 9, 8, 4, 7]),
        shape=5
    )

    trace = pm.sample(100000, tune=10000)

TypeError: The generic 'SharedVariable' object is not subscriptable. This shared variable contains a Numpy array with dtype: 'object'. This data type is not currently recognized by Theano tensors: please cast your data into a supported numeric type if you need Theano tensor functionalities.

Yes, it is possible. Wrapping PyMC3 random variables in np.array produces an object-dtype array, which theano.shared cannot handle; instead, define each prior as a single vector-valued random variable by specifying its shape:

with pm.Model() as model:
    X0 = pm.Bernoulli('X0', 0.5, shape=5)
    alphas = pm.Uniform('alphas', 1, 100, shape=2)
    betas = pm.Uniform('betas', 1, 100, shape=2)
    alpha_0 = alphas[X0]
    beta_0 = betas[X0]
    theta = pm.Beta('theta', alpha=alpha_0, beta=beta_0, shape=5)
    X1 = pm.Binomial(
        'X1',
        n=10,
        p=theta,
        observed=np.array(
            [5, 9, 8, 4, 7]
        )
    )

However, if you are sampling from this model it won't work out of the box, as it suffers from label switching. For more information, see "Identifying Bayesian Mixture Models in PyMC3" (a port of the Stan case study).
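One common mitigation (a sketch; it assumes the ordered transform in pymc3.distributions.transforms is available in your PyMC3 version) is to break the A/B symmetry by constraining one of the parameter vectors to be ordered:

import numpy as np
import pymc3 as pm
import pymc3.distributions.transforms as tr

with pm.Model() as model:
    X0 = pm.Bernoulli('X0', 0.5, shape=5)
    # Ordering alphas breaks the symmetry between the two coins. This replaces
    # the Uniform's default interval transform, so the bounds are enforced only
    # through the log-probability; the testval keeps the chain inside them.
    alphas = pm.Uniform('alphas', 1, 100, shape=2,
                        transform=tr.ordered, testval=np.array([10., 50.]))
    betas = pm.Uniform('betas', 1, 100, shape=2)
    alpha_0 = alphas[X0]
    beta_0 = betas[X0]
    theta = pm.Beta('theta', alpha=alpha_0, beta=beta_0, shape=5)
    X1 = pm.Binomial('X1', n=10, p=theta,
                     observed=np.array([5, 9, 8, 4, 7]))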


This is very helpful. Thank you!