Looping through random variables

Hello,

I’m trying to follow Chapter 2 of Model Based Machine Learning http://mbmlbook.com/LearningSkills_Moving_to_real_data.html

It’s about assessing people’s skills through questions they answer. The generative model is such that there are 7 skills and 48 questions. Each question may test one or more skills. I initialize each skill as a Bernoulli (shape of 7, since there are 7 skills)

skills = pm.Bernoulli('skills', p=0.5*np.ones(len(df_skills.columns)), shape=len(df_skills.columns))

Next, I want to loop through each question, checking which skills are required to answer it. I do that with:

hasAllSkillsForQuestionP = np.ones(len(df_correct.columns))

for q in range(len(df_correct.columns)):
    
    skills_indexes_needed_for_this_q = np.where(df_skills_per_q.transpose()[q].tolist())[0]
    
    for skill in skills_indexes_needed_for_this_q:
        
        hasAllSkillsForQuestionP[q] = hasAllSkillsForQuestionP[q]*skills[skill]

And then I add unpredictability (that is, if a candidate has all the skills needed for a question, the probability of a correct answer is 0.9; if the candidate doesn't have all the skills, the probability is the guessing rate of 0.2, since it's multiple choice with 5 options per question):

isCorrectQuestion = pm.Deterministic('isCorrectQuestion', pm.math.switch(hasAllSkillsForQuestionP, pm.Bernoulli(p=0.9), pm.Bernoulli(p=0.2)))

When I try compiling the code, I get this error:


ValueError                                Traceback (most recent call last)
&lt;ipython-input&gt; in &lt;module&gt;
     15     for skill in skills_indexes_needed_for_this_q:
     16
---> 17         hasAllSkillsForQuestionP[q] = hasAllSkillsForQuestionP[q]*skills[skill]
     18
     19 #hasAllSkillsForQuestion = pm.Bernoulli('hasAllSkillsForQuestion', p=hasAllSkillsForQuestionP, shape=len(hasAllSkillsForQuestionP))

ValueError: setting an array element with a sequence.

Why is this happening? As per the docs, random variables with shape support full indexing, like:

with model:
    y = x[0] * x[1]  # full indexing is supported

How do I fix this?

You cannot assign a Theano tensor to hasAllSkillsForQuestionP, which is a NumPy array. You can either create a Theano tensor and assign values into it, or index into skills and reshape it.


Thanks. I changed

hasAllSkillsForQuestionP = np.ones(len(df_correct.columns))

to

hasAllSkillsForQuestionP = pm.Deterministic('hasAllSkillsForQuestionP', theano.shared(np.ones((22, len(df_correct.columns)))))

and

hasAllSkillsForQuestionP[q] = hasAllSkillsForQuestionP[q]*skills[skill]

to

tt.set_subtensor(hasAllSkillsForQuestionP[:,q], hasAllSkillsForQuestionP[:,q]*skills[:,int(skill)])

where tt is the Theano library, and the additional dimension is something I added later for my model (not related to my question).

The model compiled successfully. Thanks for your help!

Glad you got it working. Still, you might want to avoid the for loop to improve performance.

Thanks. Yes, that’s what I ended up doing. Instead of looping, I did matrix multiplication (which had the same effect).

My updated model looks like this:

skills = pm.Bernoulli('skills', p=0.5*np.ones((22, 7)), shape=(22,7))

hasAllSkillsForQuestionP = tt.dot(skills, df_skills_per_q_t.T.values)    

hasAllSkillsForQuestion = pm.Bernoulli('hasAllSkillsForQuestion', p=hasAllSkillsForQuestionP, shape=(22,len(df_correct.columns)))

isCorrectQuestion = pm.Bernoulli('isCorrectQuestion', p=pm.math.switch(hasAllSkillsForQuestion, 0.9, 0.2), observed=df_correct, shape=(22, 48))
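One way to sanity-check a loop-free formulation is to compare it against the loop on toy data (everything below is illustrative, not the data from this thread). Note that for 0/1 skills the per-question "has all skills" is a product (a logical AND), whereas a plain dot product gives a count of matching skills, so the two are not interchangeable:

```python
import numpy as np

# Toy data: 2 people x 3 skills, and a 3 x 2 matrix Q where Q[s, q] == 1
# means question q requires skill s (illustrative values only).
skills = np.array([[1, 0, 1],
                   [1, 1, 1]])
Q = np.array([[1, 0],
              [1, 0],
              [0, 1]])

# Loop version: multiply the required skills per (person, question).
loop = np.ones((2, 2))
for p in range(2):
    for q in range(2):
        for s in np.where(Q[:, q])[0]:
            loop[p, q] *= skills[p, s]

# Vectorized version: x**0 == 1, so skills a question does not require
# drop out of the product over the skill axis.
vec = np.prod(skills[:, :, None] ** Q[None, :, :], axis=1)
```

The same power-and-product trick works with tt.prod on Theano tensors, replacing the Python loop inside the model.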

But now the problem has shifted to inference. When I sample for 1000 steps, PyMC3 selects BinaryGibbsMetropolis: [skills, hasAllSkillsForQuestion], and when I inspect the trace afterwards, it's really bad. Most posteriors have NaN for n_eff (which I interpret as the effective number of samples), and the rest have only 3 or 4.

Do you know what might be happening, and how I can do better inference?

In the book they use message-passing algorithms for inference, but I understand that PyMC3 doesn't implement them.

In your model you have latent discrete variables, which is why sampling needs BinaryGibbsMetropolis. In general, discrete latent variables are very difficult to do inference on; the general recommendation is to treat this as a mixture-model problem and rewrite it as a marginalized mixture model.
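The marginalization idea can be sketched in a few lines (a toy illustration under assumed numbers, not the model from this thread): instead of sampling the latent Bernoulli, sum over its two states analytically, which leaves a fully continuous quantity for the sampler.

```python
import numpy as np

p_has_all = 0.7  # illustrative probability that a person has all required skills

# Sum over the two states of the latent Bernoulli:
# P(correct) = P(has all skills)*0.9 + P(missing a skill)*0.2
p_correct = p_has_all * 0.9 + (1 - p_has_all) * 0.2
```

In the PyMC3 model this would mean passing such a marginal probability directly as the p of the observed Bernoulli, so no discrete latent variable is left for Gibbs sampling.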

I think this post is the closest to your problem and a good place to start:

See also:

It provides a way to marginalize the latent discrete variables.

Thanks. I do not fully understand mixture models or how to construct marginalized models. Is there a tutorial that I should follow?

By the way, I ended up changing the Bernoulli

skills = pm.Bernoulli('skills', p=0.5*np.ones((22, 7)), shape=(22,7))

to a Beta distribution

skills = pm.Beta('skills', alpha=2, beta=2, shape=(22,7))

and the model is providing better estimates. I also felt perhaps I should have used a Beta in the first place, as the probability of having a skill can lie between 0 and 1 rather than being binary.
