Variables with no variation during sampling

Here is code from the GitHub site dedicated to translating the code from the book “Bayesian Cognitive Modeling” into PyMC. This is problem 6.1 (Exam Scores) from Chapter 6 (Latent-mixture models):

import warnings
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc3 as pm
import theano.tensor as tt

from matplotlib import gridspec
from scipy import stats

%config InlineBackend.figure_format = 'retina'
warnings.simplefilter(action="ignore", category=FutureWarning)
RANDOM_SEED = 8927
np.random.seed(286)

k = np.array([21, 17, 21, 18, 22, 31, 31, 34, 34, 35, 35, 36, 39, 36, 35])
p = len(k)  # number of people
# p is 15
n = 40  # number of questions

with pm.Model() as model1:
    # group prior
    # Why aren't the zi sampled continuously during the MCMC? I do not understand. 
    zi = pm.Bernoulli("zi", p=0.5, shape=p)
    # accuracy prior
    phi = pm.Uniform("phi", upper=1, lower=0.5)
    psi = 0.5
    theta = pm.Deterministic("theta", phi * tt.eq(zi, 1) + psi * tt.eq(zi, 0))

    # observed
    ki = pm.Binomial("ki", p=theta, n=n, observed=k)

    trace1 = pm.sample(random_seed=RANDOM_SEED)

ztrace = trace1["zi"]
print("Grouping", ztrace[-1, :])
print("Grouping", ztrace[5, :])

az.plot_trace(trace1, var_names=["zi", "phi"], compact=True);
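To make the "mostly constant" observation precise, you can count how many times each z column changes value between consecutive draws. A numpy-only sketch, using a small synthetic stand-in for `trace1["zi"]` (the array below is made up purely to illustrate the bookkeeping):

```python
import numpy as np

# Synthetic stand-in for trace1["zi"]: 1000 draws for 3 people (made-up data).
ztrace = np.zeros((1000, 3), dtype=int)
ztrace[200:210, 1] = 1   # person 1 briefly visits the z=1 state
ztrace[:, 2] = 1         # person 2 is constant at 1

# Count how often each column changes value between consecutive draws.
flips = np.sum(np.diff(ztrace, axis=0) != 0, axis=0)
print(flips)  # -> [0 2 0]: person 0 never flips, person 1 flips twice
```

Running the same two lines on the real `trace1["zi"]` gives the per-person flip counts directly.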

If one prints out the trace for the different z random variables, one finds that they are mostly constant. For example, some of the z's are actually constant, and some flip only 1-5 times over the sampled iterations. I am trying to understand why this is the case. Any insights would be appreciated. Thanks.


If you check the posterior, you should see that the zi parameter is mostly either 0 or 1:

In [24]: np.mean(trace1['zi'],axis=0)
Out[24]: 
array([0.     , 0.     , 0.     , 0.     , 0.     , 0.9932 , 0.99335,
       1.     , 0.99995, 1.     , 1.     , 1.     , 1.     , 1.     ,
       1.     ])

So you’re not seeing much switching behavior because there isn’t much switching behavior (i.e., there is quite a bit of certainty about the value of this parameter in the posterior). Is that not what you expected?
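As a sanity check on that certainty, the conditional posterior P(zi = 1 | ki, phi) has a closed form for a fixed phi, since the zi prior is 0.5/0.5: it is just the ratio of the two binomial likelihoods. The sketch below plugs in phi = 0.86 as an illustrative value (an assumption for this example, not a number read off the trace):

```python
import numpy as np
from scipy import stats

k = np.array([21, 17, 21, 18, 22, 31, 31, 34, 34, 35, 35, 36, 39, 36, 35])
n = 40
phi = 0.86  # illustrative fixed value; in the model phi is sampled

# With a 0.5/0.5 prior on zi, the conditional posterior is the ratio
# of the two binomial likelihoods.
like1 = stats.binom.pmf(k, n, phi)  # "knows the answers" group
like0 = stats.binom.pmf(k, n, 0.5)  # "guessing" group
p_z1 = like1 / (like1 + like0)
print(np.round(p_z1, 3))
```

The resulting probabilities sit very close to 0 for the low scorers and very close to 1 for the high scorers, which matches the near-constant columns in the trace: when the conditional posterior is that extreme, the discrete sampler almost never has a reason to flip a zi.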