Sampling error when using complex tensor function as distribution parameter


#1

I have hit a wall trying to construct a particular model, and I wanted to ask for some help troubleshooting it. The model calculates probabilities from a normal CDF whose mean/sd are PyMC3 distributions; I transform those probabilities and try to use them as the parameters of an output distribution. Whenever I try to sample from that distribution, I quickly get: ValueError: Mass matrix contains zeros on the diagonal.

I simplified the model down as far as I could while still reproducing the error. I can sample through my function and generate what look to be perfectly fine samples of the u/“spprop2” variable, but as soon as I use that as a parameter for a subsequent distribution, the error occurs. Most of the hits for this error seem to be about overflows resulting from extreme priors. I don’t have any observed RVs in my model at this point; everything up to the K distribution behaves as I would expect, and the samples of u look perfectly fine and don’t seem like they would cause any overflows.
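For context, the probability step described above can be sketched in plain numpy. This is a hypothetical illustration, not the actual model: the names (`gates`, `mu`, `sigma`) and values are my own, and the real model uses PyMC3 random variables rather than floats. It just shows the shape of the computation: bin probabilities from a normal CDF evaluated at gate boundaries, which then parameterize an output distribution.

```python
import math
import numpy as np

def normal_cdf(x, mu, sigma):
    # Standard normal CDF via erf; erf handles +/-inf gracefully.
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Illustrative log10-scaled boundaries covering the whole real line.
gates = [-np.inf, 2.7, 3.5, 3.9, np.inf]
mu, sigma = 3.2, 0.4

cdf = np.array([normal_cdf(g, mu, sigma) for g in gates])
probs = np.diff(cdf)          # one probability per bin (four splits)
print(probs, probs.sum())     # the four probabilities sum to 1
```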

I would appreciate any feedback on this and ideas about how to resolve this error.


#2

gates = np.log10(np.array([0, 500, 3000, 7000, np.inf])) contains -inf and inf, would that be the problem?
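For reference, this is what that line produces: log10 maps the endpoints 0 and inf to -inf and +inf, so two non-finite constants enter the graph.

```python
import numpy as np

# Reproducing the gate array; log10(0) is -inf and log10(inf) is inf.
with np.errstate(divide="ignore"):   # silence the log10(0) warning
    gates = np.log10(np.array([0, 500, 3000, 7000, np.inf]))
print(gates)   # [-inf  2.698...  3.477...  3.845...  inf]
```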


#3

Removing those does resolve the issue, although I’m not sure why. The function f() does what I was hoping it would with infinite input: the inf values in gates should disappear inside f(), as they exist only to allow calculating a probability for all four splits of the input normal distributions across the entire real line (the erf() handles them appropriately). If I replace them with very small and very large finite numbers (even finfo(float64).eps and finfo(float64).max), the issue goes away, so that is good.
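A sketch of that finite-endpoint workaround (the `np.clip` call is my own phrasing of it, assuming the eps/max values mentioned above):

```python
import numpy as np

# Clamp the endpoints to extreme-but-finite floats before log10, so
# no inf ever enters the graph. The resulting CDF bin probabilities
# are numerically unchanged, because the normal CDF saturates to 0/1
# long before these magnitudes.
raw = np.array([0, 500, 3000, 7000, np.inf])
finite = np.clip(raw, np.finfo(np.float64).eps, np.finfo(np.float64).max)
gates = np.log10(finite)
print(gates)   # first entry ~ -15.65, last ~ 308.25, all finite
```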

Ultimately, I am not sure why this fixed the problem, because the f() function is returning identical values; I’ll chalk it up to some Theano thing. Thank you for the pointer.


#4

Although the function is returning the same value, it can make a difference because NUTS also needs the gradient of the function. If some intermediate step contains inf, I imagine that would produce an error for NUTS (with no error necessary in the Theano forward pass).
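A toy illustration of this point, using only the standard library (this is not the exact graph PyMC3 builds): the forward value through erf saturates and is perfectly finite, but the chain rule can pair a zero local derivative with an infinite intermediate, and 0 * inf is nan in IEEE floating point, which is enough to poison the gradients NUTS relies on.

```python
import math

x = float("inf")                 # an inf intermediate in the graph
forward = math.erf(x)            # 1.0 -- forward pass looks fine
local_grad = 2 / math.sqrt(math.pi) * math.exp(-x * x)   # 0.0
grad = local_grad * x            # 0.0 * inf -> nan
print(forward, grad)             # 1.0 nan
```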