Hi, I am trying to adapt the example from the "Model-Based Machine Learning" book, chapter 3 (link1 link2), where outcomes are parametrized as resulting from a deterministic greater-than comparison:
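Roughly, the model looks like this (a minimal sketch; the priors and the per-performance noise are placeholders, loosely based on numbers that appear later in this thread):

import pymc3 as pm
import theano.tensor as tt

with pm.Model() as model:
    # latent skills (placeholder priors)
    JSkill = pm.Normal('JSkill', 120., 40.)
    FSkill = pm.Normal('FSkill', 100., 5.)
    # noisy performances on the day (placeholder noise sd)
    Jperf = pm.Normal('Jperf', JSkill, 5.)
    Fperf = pm.Normal('Fperf', FSkill, 5.)
    # deterministic greater-than comparison decides the outcome
    JWin = pm.Bernoulli('JWin', p=tt.switch(Jperf > Fperf, 1., 0.), observed=1)
    trace = pm.sample()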
The model works okayish for the observed outcome=1 (there are many divergences), but it fails altogether (“Bad energy”) if the outcome=0. Is there a smarter way to parametrize the model that would work with PyMC3?
You will need to set the testval by hand, as the starting value is invalid (most likely because Jperf == Fperf if you left it unspecified), which makes observed=0 impossible and gives you a -inf logp.
Thanks for the reply.
I tried setting the testval to the observed value but it did not fix it. I also don't see how the equality was left unspecified (it was zero in the case of Jperf == Fperf, no?); in any case I changed '>' to '>='. It still only works when a value of 1 is observed (which is the most likely outcome a priori), but not when a 0 is observed.
What I meant is something like:
Jperf = pm.Normal(..., testval=0)
Fperf = pm.Normal(..., testval=1)
So when observed=0, the starting point won't be invalid, because p will also equal 0 (whereas if Jperf >= Fperf is True, then p=1 and observed could not be 0).
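Putting it together, a small sketch (with standard-normal placeholder priors) where you can check that the starting point gives a finite logp for observed=0:

import pymc3 as pm
import theano.tensor as tt

with pm.Model() as model:
    # testvals chosen so that Jperf < Fperf at the starting point
    Jperf = pm.Normal('Jperf', 0., 1., testval=0.)
    Fperf = pm.Normal('Fperf', 0., 1., testval=1.)
    # p is 0 at the start, so observed=0 has a finite log-probability there
    JWin = pm.Bernoulli('JWin', p=tt.switch(Jperf >= Fperf, 1., 0.), observed=0)

# all logp values at the test point should be finite (no -inf entries)
print(model.check_test_point())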
However, a more general problem with your model is that there are better ways to model latent strength: the comparison (or the tt.switch) discretizes the variable, which is usually a bad idea.
Thank you very much.
I had already figured out that I had to change the testval of the input to the Bernoulli. I ran a prior predictive check, estimated the mean of each input 'Perf' distribution for each outcome (0 or 1), and used that mean as the testval. It works… with a lot of divergences… but it works.
However, now I have to set the value manually and compile the model for each outcome. Is there a way to do this dynamically after setting up the model, as when altering a shared variable?
Something like:
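For example, something along these lines (a hypothetical sketch: the observed outcome lives in a theano shared variable, and the start values are passed to pm.sample instead of recompiling with new testvals):

import numpy as np
import pymc3 as pm
import theano
import theano.tensor as tt

# observed outcome held in a shared variable so it can be swapped later
outcome_data = theano.shared(np.array(1.))

with pm.Model() as model:
    Jperf = pm.Normal('Jperf', 120., 40.)   # placeholder priors
    Fperf = pm.Normal('Fperf', 100., 5.)
    JWin = pm.Bernoulli('JWin', p=tt.switch(Jperf >= Fperf, 1., 0.),
                        observed=outcome_data)

# switch to outcome=0 without recompiling the model, and pass matching
# starting values to pm.sample() rather than hard-coding new testvals
outcome_data.set_value(np.array(0.))
with model:
    trace = pm.sample(start={'Jperf': 90., 'Fperf': 110.})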
Also I realize that discretizing the data is probably not the best way to model latent strength, but I wanted to see how far I could go in replicating the models from the book in PyMC3.
… The bad news is that if you see a lot of divergences, the model is NOT working…
I really think you should change your model parameterization. See eg: Trueskill model in pymc3 - #2 by junpenglao
Thanks for the input.
Those specific links don't really seem to help, but I will try to find something that does. I saw someone on the Stan forums suggesting reparameterizing it as a cumulative Gaussian, so maybe there is something in there: https://discourse.mc-stan.org/t/implementing-a-factor-graph-model-for-the-trueskill-algorithm/7810.
In any case, thanks for your help so far.
Yes, you can also use a cumulative Gaussian in PyMC3:
with pm.Model() as m:
    # latent skills
    JSkill = pm.Normal('JSkill', 120., 40.)
    FSkill = pm.Normal('FSkill', 100., 5.)
    # normal_lccdf(mu, sigma, x) is log P(Normal(mu, sigma) > x), so lg_p is the
    # log-probability that J outperforms F when the performance difference
    # is Normal(JSkill - FSkill, sqrt(50))
    lg_p = pm.distributions.dist_math.normal_lccdf(JSkill - FSkill, pm.math.sqrt(50.), 0.)
    JWin = pm.Bernoulli('JWin', pm.math.exp(lg_p), observed=1.)
    trace = pm.sample()
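Equivalently (if I am not mistaken), you can write the win probability with the inverse probit directly and skip the exp/log round trip:

import pymc3 as pm

with pm.Model() as m2:
    JSkill = pm.Normal('JSkill', 120., 40.)
    FSkill = pm.Normal('FSkill', 100., 5.)
    # Phi((JSkill - FSkill) / sqrt(50)) = P(performance difference > 0)
    p_win = pm.math.invprobit((JSkill - FSkill) / pm.math.sqrt(50.))
    JWin = pm.Bernoulli('JWin', p_win, observed=1.)
    trace = pm.sample()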