Nested Parallel in PyMC3

Have you tried sampling with cores=1? That disables PyMC3's parallel sampling, so the chains run one after another in a single process and do not conflict with the parallelism in your custom forward computation (or, if there is a bug, it will at least surface it).
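Something like the following is what I mean. It is only a minimal sketch: the helper name `sample_sequentially`, the `SAMPLE_KWARGS` dict, and the draw/tune/chain counts are placeholders I made up, and `model` stands in for your own `pm.Model` with the custom forward computation. The only part that matters is passing `cores=1` to `pm.sample`.

```python
# Hypothetical defaults for illustration; only cores=1 is the actual suggestion.
SAMPLE_KWARGS = {"draws": 500, "tune": 500, "chains": 2, "cores": 1}


def sample_sequentially(model, **overrides):
    """Run pm.sample on `model` with cores=1 so chains are sampled one by one.

    `model` is assumed to be an existing pm.Model that wraps the custom
    forward computation; it is not defined in this sketch.
    """
    # Imported inside the function so the snippet can be loaded and inspected
    # even on a machine where PyMC3 is not installed.
    import pymc3 as pm

    # Caller-supplied overrides win over the defaults above.
    kwargs = {**SAMPLE_KWARGS, **overrides}
    with model:
        return pm.sample(**kwargs)
```

You would then call `trace = sample_sequentially(my_model)`. If that runs cleanly while the default multi-core run hangs or crashes, the problem is most likely the interaction between the two layers of parallelism rather than the model itself.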