Reinforcement learning model - derivative of RV is zero

toby_wise · April 10, 2019, 5:04pm

Thanks both!

@chartl I think the index errors in the first approach are caused by alpha going out of bounds, which in turn leads to NaNs. This line

choice = prob.shape[0] - T.sum(rand <= cumsum, axis=0)

then gives a choice of 4, because the comparison against cumsum (which is now NaN) is never true, so we get the shape of 4 minus a sum of 0.
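To illustrate the failure mode, here is a minimal sketch that uses NumPy as a stand-in for Theano (an assumption on my part, the model itself uses `T.sum`). The helper name `sample_choice`, the probability vectors, and the fixed random draw of 0.25 are all made up for the example:

```python
import numpy as np

def sample_choice(prob, rand):
    """Inverse-CDF sampling, mirroring the `choice = ...` line above.

    NumPy is used here instead of Theano purely for illustration.
    """
    cumsum = np.cumsum(prob, axis=0)
    # Count how many cumulative probabilities are >= rand,
    # then subtract that count from the number of options.
    return prob.shape[0] - np.sum(rand <= cumsum, axis=0)

# Hypothetical probabilities over 4 options (valid indices are 0..3).
valid = np.array([0.1, 0.2, 0.3, 0.4])
# What the probabilities look like once alpha has produced NaNs.
broken = np.full(4, np.nan)

choice_valid = sample_choice(valid, 0.25)   # an in-range index
choice_nan = sample_choice(broken, 0.25)    # every comparison with NaN is False

print(choice_valid, choice_nan)
```

With valid probabilities the result stays within 0..3, but with NaNs every `rand <= cumsum` comparison is False, so the sum is 0 and the choice equals `prob.shape[0]`, which is one past the last valid index.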

I tried switching the Normal to a bounded Normal to prevent alpha going out of bounds using the following:

BoundedNormal = pm.Bound(pm.Normal, lower=-0.5, upper=0.5)
alpha_err = BoundedNormal('alpha_err', mu=1.0, sd=1, shape=(15,))

This doesn’t produce any index errors, but I still get the same mass matrix error.

I’ll have a go at unwrapping the scan loop. It seems to me that there must be a simpler solution though, especially as similar models seem to work fine (e.g. Modeling reinforcement learning of human participant using PyMC3).

@Gon_F My feeling is that there must be something wrong with the code - I’ve done a lot of playing around though and can’t seem to find anything!
