The results of a switch are observed directly (beginner)

I have just started learning pymc3 so I might be thinking about this completely the wrong way.

Assume that we observe a vector of 10 booleans.

The process of interest generates the (observed) booleans from a Bernoulli distribution with parameter theta1. So I define a Beta prior over theta1 and a variable of length 10 that is a sample from Bernoulli(theta1).

However, this true sample is disturbed: each true value is sometimes switched to 0, with probability theta2. So I define a switch to 0 driven by a Bernoulli(theta2) draw.

The switched values are the ones I observe. I am not sure how to tell the model that I observed the switched variables, i.e. how to fit the model to the observed data.

This is what I have for now, and I am kind of stuck:

import numpy as np
import pymc3 as pm

# observed data (already switched)
observed_data = np.random.binomial(1, 0.5, size=10)

with pm.Model() as skeptic_model:
    # uniform prior on the Bernoulli parameter
    true_model_prior = pm.Beta("true_model_prior", 1, 1)
    true_data = pm.Bernoulli("true_data", p=true_model_prior, shape=observed_data.shape)
    # when the disturbance fires (1), the value is switched to 0; otherwise keep true_data
    disturbed_data = pm.math.switch(
        pm.Bernoulli("disturbed", 0.1, shape=observed_data.shape), 0, true_data
    )

It helps to think of this in a generative way, so the first step is to try to express the data-generating process and generate some fake data.
The procedure sounds like it can be framed as coin flipping:

  1. First flip a coin A with p = theta2; if coin A comes up heads, set observed = 0.
  2. If coin A comes up tails, flip a second coin B with p = theta1 and set observed to the outcome of coin B.

So a process to generate a sample would be:

import numpy as np

theta1, theta2 = .7, .3
obs = []
for i in range(10):
    # coin A: with probability theta2 the observation is forced to 0
    if np.random.binomial(1, theta2) == 1:
        obs.append(0)
    else:
        # coin B: otherwise the observation is a draw from Bernoulli(theta1)
        obs.append(np.random.binomial(1, theta1))
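
A vectorized numpy equivalent, as a sketch of the same two-coin process (np.where zeroes out every position where coin A comes up heads):

import numpy as np

theta1, theta2 = .7, .3
coin_a = np.random.binomial(1, theta2, size=10)  # 1 means the observation is forced to 0
coin_b = np.random.binomial(1, theta1, size=10)  # the underlying "true" draws
obs = np.where(coin_a == 1, 0, coin_b)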

To model this in pymc3, the most straightforward way is to introduce a latent variable representing the unknown outcome of coin A's flip.

import pymc3 as pm
import theano.tensor as tt

with pm.Model() as m:
    theta_2 = pm.Uniform('theta_2', 0., 1.)
    # latent outcome of coin A for each data point (1 = observation forced to 0)
    coinA = pm.Bernoulli('latent', p=theta_2, shape=10)
    theta_1 = pm.Uniform('theta_1', 0., 1.)
    # success probability: theta_1 when coin A is tails (index 0), 0 when heads (index 1)
    p = tt.stack([theta_1, 0.])
    observed = pm.Bernoulli('obs', p=p[coinA], observed=obs)
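
You can then fit it by sampling as usual; pymc3 should automatically assign a discrete step method (BinaryGibbsMetropolis) to the latent coinA variable and NUTS to the continuous parameters:

with m:
    trace = pm.sample(1000, tune=1000)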

FYI: what we have here is a mixture model, specifically a zero-inflated binomial (with n = 1, i.e. a zero-inflated Bernoulli). Rewriting the above model as a marginalized mixture, with the discrete latent variable summed out of the likelihood, would be much better for inference. You can also have a look at the related literature and examples.
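
As a minimal sketch of that marginalization (reusing the obs array from above): summing coin A out of the likelihood gives P(obs = 1) = (1 - theta2) * theta1, so the discrete latent variable can be dropped entirely:

import pymc3 as pm

with pm.Model() as m_marginal:
    theta_2 = pm.Uniform('theta_2', 0., 1.)
    theta_1 = pm.Uniform('theta_1', 0., 1.)
    # marginal probability of observing a 1: coin A tails AND coin B heads
    observed = pm.Bernoulli('obs', p=(1. - theta_2) * theta_1, observed=obs)

The same model can also be expressed with pymc3's built-in ZeroInflatedBinomial (psi = 1 - theta_2, n = 1, p = theta_1). Note that with only Bernoulli data the likelihood depends on theta_1 and theta_2 only through the product (1 - theta_2) * theta_1, so it is the priors that separate them.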
