Sorry, my question wasn’t very precise. What I meant was a way to pass parameters like the learning rate to pm.sample, something analogous to nuts_kwargs/step_kwargs.
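
To make it concrete, I'm imagining something like this. Note that `optimizer_kwargs` and its `learning_rate` key are placeholders I made up to illustrate the kind of pass-through I mean, not an existing API:

```python
import pymc3 as pm

with pm.Model() as model:
    x = pm.Normal("x", mu=0, sigma=1)
    trace = pm.sample(
        1000,
        # hypothetical keyword, analogous to nuts_kwargs/step_kwargs:
        # forward settings to the underlying optimizer/step method
        optimizer_kwargs={"learning_rate": 0.01},
    )
```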
And about the NaNs: wouldn’t it be possible to always store the previous state before a step, and then, if we encounter a NaN, go back one step and decrease the learning rate? I don’t really know the literature on these optimizers well, so I hope I’m not asking something stupid here.
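
Roughly what I have in mind, as a sketch (here `step_fn` and the dict-of-floats state are stand-ins for whatever the optimizer actually does, not any real API):

```python
import copy
import math

def step_with_rollback(state, learning_rate, step_fn, shrink=0.5):
    """Try one optimizer step; if it produces NaNs, roll back to the
    previous state and retry with a smaller learning rate."""
    previous = copy.deepcopy(state)            # keep the last good state
    new_state = step_fn(state, learning_rate)  # stand-in for the real update
    if any(math.isnan(v) for v in new_state.values()):
        # NaN encountered: discard the step and shrink the learning rate
        return previous, learning_rate * shrink
    return new_state, learning_rate
```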