I am currently using pymc4.
I have a model which is moderately complicated, and runs (in parallel) on different batches of data.
What I have seen - as expected - is that the NUTS sampler produces ‘good’ results (that is, if I generate synthetic data, then NUTS recovers the correct parameter distributions), and ADVI works less well in that respect.
For most of the datasets that are passed in, the NUTS sampler is fine: it is quite fast and produces sensible results. However, for about 1 case in 10 the NUTS sampler ‘hangs’ and does not start sampling at all.
I have tried different ways of initializing/configuring NUTS, and haven’t found anything that fixes this problem robustly yet.
My next thought is to set this up so that it samples with NUTS and, only if that fails, falls back to an alternative implementation that uses ADVI. One way to do this is to have a time-out for the sampler, and kill the process if that is exceeded. I don’t like that solution, because I don’t necessarily know in advance the expected runtime for a new dataset, or the sampler might just be running slower than normal - and I could end up killing a number of samplers that are actually working.
I think a better solution would be to kill the NUTS sampler (and run ADVI) if after some time (say 10 minutes) the NUTS sampler has not yet started (i.e. progress bar still at zero). However, this requires some way of interrogating a sampler while it is running to find out its status - using a callback of some sort, or perhaps a monitor thread in the same process.
Is such a thing possible?
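
To make the idea concrete, here is a rough sketch of the kind of monitor described above, using only the Python standard library. It is not tied to any sampler: `run_with_startup_deadline`, `healthy_worker` and `stuck_worker` are hypothetical names, and the workers only stand in for a sampling run. The sketch assumes a POSIX system (it uses the "fork" start method) and assumes the sampler can be made to bump a shared counter once per draw.

```python
import multiprocessing as mp
import time

# Assumption: a POSIX system, where the "fork" start method is available.
_ctx = mp.get_context("fork")


def run_with_startup_deadline(worker, startup_timeout, poll_interval=0.1):
    """Run worker(draws) in a child process and watch for the first draw.

    `worker` is a hypothetical function that receives a shared counter and
    increments it once per draw. The child is killed only if the counter is
    still zero after `startup_timeout` seconds; once a draw has arrived,
    no time limit applies.
    """
    draws = _ctx.Value("i", 0)  # shared progress counter, starts at zero
    proc = _ctx.Process(target=worker, args=(draws,))
    proc.start()
    deadline = time.monotonic() + startup_timeout
    while proc.is_alive():
        if draws.value == 0 and time.monotonic() > deadline:
            proc.terminate()  # no progress at all: treat as hung
            proc.join()
            return "never_started"
        time.sleep(poll_interval)
    proc.join()
    return "finished" if proc.exitcode == 0 else "crashed"


def healthy_worker(draws):
    # Stand-in for a sampler that starts promptly and reports each draw.
    for _ in range(5):
        time.sleep(0.01)
        with draws.get_lock():
            draws.value += 1


def stuck_worker(draws):
    # Stand-in for a sampler that never produces a first draw.
    time.sleep(60)
```

A caller would run the NUTS job through `run_with_startup_deadline` and switch to the ADVI path only when the result is `"never_started"` (or `"crashed"`). For the real sampler, the counter would presumably be incremented from a per-draw hook; I believe `pm.sample` accepts a `callback` argument that is called once per draw, but that should be checked against the installed version. Note also that `terminate()` stops only the child process itself, so any processes the sampler spawns for parallel chains may need separate cleanup.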