I’m going through the Bayesian Cognitive Modeling book, and one of the exercises gives the following latent mixture model for the number of questions each student answers correctly on a 40-question exam. Students scoring around 20 are assumed to be guessing, and the goal is to determine who is likely guessing and who studied.
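import numpy as np
import pymc3 as pm

scores = [21, 17, 21, 18, 22, 31, 31, 34, 34, 35, 35, 36, 39, 35]

with pm.Model() as model:
    # zi = 1: student studied; zi = 0: student is guessing
    zi = pm.Bernoulli('zi', p=0.5, shape=len(scores))
    # success rate for students who studied, one per student
    phi = pm.Uniform('phi', 0.5, 1, shape=len(scores))
    # success rate for students who are guessing
    psi = 0.5
    theta = pm.Deterministic('theta', pm.math.eq(zi, 1) * phi + pm.math.eq(zi, 0) * psi)
    pm.Binomial('obs', p=theta, n=40, observed=scores)
    traces = pm.sample(2000, tune=10000, cores=4)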
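Written out, this is how I read the first model. The symbol k_i for student i's score is my own shorthand and does not appear in the code:

$$
\begin{aligned}
z_i &\sim \mathrm{Bernoulli}(0.5) \\
\phi_i &\sim \mathrm{Uniform}(0.5,\ 1) \\
\psi &= 0.5 \\
\theta_i &= z_i\,\phi_i + (1 - z_i)\,\psi \\
k_i &\sim \mathrm{Binomial}(40,\ \theta_i)
\end{aligned}
$$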
I wanted to see how that model differs from one that uses PyMC3’s Mixture class, so I came up with a different model:
with pm.Model() as mixture_model:
    w = pm.Dirichlet('w', a=np.ones(2))
    sp = pm.Uniform('sp', 0.5, 1, shape=len(scores))
    dist1 = pm.Binomial.dist(p=sp, n=40, shape=len(scores))
    dist2 = pm.Binomial.dist(p=[0.5] * len(scores), n=40, shape=len(scores))
    mixt = pm.Mixture('mixt', w=w, comp_dists=[dist1, dist2], observed=scores)
    traces = pm.sample(3000, tune=1000, cores=4)
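As far as I understand pm.Mixture, it sums over the components weighted by w rather than sampling a per-student indicator, so I believe the likelihood for each score looks like this. Again, k_i is my own shorthand for student i's score, and I am assuming w_1 goes with dist1 and w_2 with dist2:

$$
\begin{aligned}
w &\sim \mathrm{Dirichlet}(1,\ 1) \\
sp_i &\sim \mathrm{Uniform}(0.5,\ 1) \\
p(k_i \mid w, sp_i) &= w_1\,\mathrm{Binomial}(k_i \mid 40,\ sp_i) + w_2\,\mathrm{Binomial}(k_i \mid 40,\ 0.5)
\end{aligned}
$$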
- Are these models essentially the same thing? The second model seems a bit limiting in that I can’t estimate the theta value for an individual user.
- In the first model, I can answer questions like “What percent of the posterior distribution of theta for a given user is > 0.5?”, which gives me a clue about which group the user belongs to. Is there a way to directly ask what the probability is that a user belongs to group 1 vs. group 2?
- Does w in the second model represent the percentage of users that belong to each group?
- In the second model, how do I answer “What is the probability that user 1 belongs to group 1 or group 2?”
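For reference, the theta check I describe for the first model can be sketched roughly like this. The array below is a made-up stand-in for traces['theta'] (rows are posterior draws, columns are students), not real output from my run:

```python
import numpy as np

# Hypothetical stand-in for traces['theta'] from the first model:
# each row is one posterior draw, each column is one student.
# These values are invented for illustration only.
theta_samples = np.array([
    [0.50, 0.50, 0.85],
    [0.50, 0.62, 0.80],
    [0.50, 0.50, 0.90],
    [0.71, 0.50, 0.88],
])

# Fraction of draws with theta > 0.5 for each student.
# Draws sitting exactly at 0.5 are the ones where theta equals psi (zi == 0).
frac_above = (theta_samples > 0.5).mean(axis=0)
print(frac_above)
```

With the real trace I would pass traces['theta'] in place of theta_samples.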