The results of a switch are observed directly (beginner)

I have just started learning pymc3 so I might be thinking about this completely the wrong way.

Assume that we observe a vector of 10 booleans.

The process of interest generates (observed) booleans with a Bernoulli distribution with a parameter theta1. So I define a Beta prior over theta1 and define a variable with length 10 that is a sample from Bernoulli(theta1).

However, this true sample is disturbed by sometimes switching the true data to 0, with a probability theta2. So I define a switch to 0 with a probability Bernoulli(theta2).

The switched values are the observed ones. I am not sure how to tell the model that I observed the switched variables, i.e. I am not sure how to fit the model to the observed data.

This is what I have for now, and I am kind of stuck:

# observed data (already switched)
observed_data = np.random.binomial(1, 0.5, size=10)

with pm.Model() as skeptic_model:
    # uniform probability of the bernoulli parameter
    true_model_prior = pm.Beta("true_model_prior", 1, 1)
    true_data = pm.Bernoulli("true_data", p=true_model_prior, shape=data.shape)
    disturbed_data = pm.math.switch(pm.Bernoulli("disturbed", 0.1), true_data, 0)

It would help to think of it in a generative way, so the first step is try to express the data generation process and generate some fake data.
The procedure of generating data sounds like you can frame it under coin flipping:

  1. first flip a coin (A) with p=\theta_2, if coin A comes up head, set observed=0
  2. if coin A comes up tail, flip a second coin B with p=\theta_1, set observed=outcome of coin B.

So a generation process to get sample would be:

theta1, theta2 = .7, .3
obs = []
for i in range(10):
    if np.random.binomial(1, theta2) == 1:
        obs.append(np.random.binomial(1, theta1))

To model it in pymc3, the most straightforward way is to have a latent variable to represent the unknown result of coin flip of coin A.

import theano.tensor as tt
with pm.Model() as m:
    theta_2 = pm.Uniform('theta_2', 0., 1.)
    coinA = pm.Bernoulli('latent', p=theta_2, shape=10)
    theta_1 = pm.Uniform('theta_1', 0., 1.)
    p = tt.stack([theta_1, 0.])
    observed = pm.Bernoulli('obs', p=p[coinA], observed=obs)

FYI: What we have here is a mixture model, or a zero inflated binomial. Rewriting the above model to a marginalized mixture model would be much better for inference. You can also have a look at related literature and examples.

1 Like