Thank you for your comments on improving the model.
I still don’t fully understand the cause of the NaN issue. Training does seem to run when I use PyMC’s default ADVI. Interestingly, the loss shown in the progressbar is inf, yet after training az.summary still produces output, so the fitting process appears to go through regardless of whether the result is accurate.
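For context, this is the kind of check I have in mind to see whether the log-probability is already non-finite at the initial point, before ADVI even starts (a minimal sketch with a placeholder model, not my actual one):

```python
import numpy as np
import pymc as pm

# Placeholder model; my real model is longer and omitted here.
with pm.Model() as model:
    sigma = pm.HalfNormal("sigma", 1.0)
    mu = pm.Normal("mu", 0.0, 1.0)
    pm.Normal("obs", mu, sigma, observed=np.random.normal(size=100))

    # Initial values of the transformed (unconstrained) variables, e.g. sigma_log__
    print(model.initial_point())
    # Per-variable log-probability at that point; a -inf or nan here
    # would already explain the inf loss in the progressbar.
    print(model.point_logps())
```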
I would like to observe how ADVI draws samples in the unconstrained space and evaluates the model’s log-likelihood during the early iterations of training. However, I’m not sure which part of the code to look at. Does anyone know where this happens?
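Concretely, something like the following is what I’m after: recording the state of the mean-field approximation (which lives in the unconstrained space) at every iteration so I can see where things start drifting toward inf/nan. This sketch uses `pm.callbacks.Tracker` as shown in the PyMC variational API docs, again with a placeholder model:

```python
import numpy as np
import pymc as pm

# Placeholder model standing in for my real one.
with pm.Model() as model:
    sigma = pm.HalfNormal("sigma", 1.0)
    mu = pm.Normal("mu", 0.0, 1.0)
    pm.Normal("obs", mu, sigma, observed=np.random.normal(size=100))

    advi = pm.ADVI()
    # Record the mean and std of the mean-field Gaussian (in the unconstrained
    # space) at every iteration, so early divergence is visible.
    tracker = pm.callbacks.Tracker(
        mean=advi.approx.mean.eval,
        std=advi.approx.std.eval,
    )
    approx = advi.fit(10_000, callbacks=[tracker])

# One entry per iteration: where the unconstrained parameters were heading.
print(tracker["mean"][:5])
print(tracker["std"][:5])
# Loss history (the values the progressbar reports).
print(advi.hist[:5])
```

That covers the per-iteration state of the approximation, but I still don’t see where in the codebase the actual sampling from the unconstrained distribution and the log-likelihood evaluation happen.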