I have a dataset with nearly a million data points, consisting of multiple arrays of predictor variables. I use Minibatch and pm.fit to find the posterior because it would be too slow to use pm.sample on the full dataset.
I am comparing several different models that use the same data, so I tried creating just a single Minibatch object for each of the data arrays I am using. This works well for the first model, but when I try to fit the other models, the Minibatch objects seem to mess up the order of the data (or something similar): the fit fails and basically just recovers the mean and standard deviation of the data. When I then try to fit the first model a second time, it fails as well.
The only workaround I have found is to create new Minibatch objects before fitting each model. Then the model fitting works again.
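The workaround could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper `fit_each_with_fresh_batches`, the two toy models, and the array names `x` and `y` are hypothetical and not taken from the original code, and the PyMC3 part assumes version 3.7 or later (for the `sigma` keyword). The point is simply that the Minibatch objects are created inside the loop, once per model, instead of being shared.

```python
def fit_each_with_fresh_batches(builders, arrays, make_batch, fit):
    """Fit every model with its own, newly created minibatch objects.

    builders   : dict of name -> callable(batches) that builds and returns a model
    arrays     : dict of name -> full data array
    make_batch : callable(array) that returns a minibatch object
    fit        : callable(model) that returns the fit result
    """
    results = {}
    for name, build in builders.items():
        # Recreate the minibatch objects for each model instead of sharing them.
        batches = {key: make_batch(arr) for key, arr in arrays.items()}
        results[name] = fit(build(batches))
    return results


def pymc3_example(x, y, batch_size=500, n=10000):
    """Hypothetical usage with PyMC3 (imported lazily; assumes PyMC3 >= 3.7)."""
    import pymc3 as pm

    def linear(b):
        with pm.Model() as model:
            a = pm.Normal("a", mu=0, sigma=10)
            slope = pm.Normal("slope", mu=0, sigma=10)
            s = pm.HalfNormal("s", sigma=1)
            # total_size rescales the minibatch likelihood to the full dataset.
            pm.Normal("obs", mu=a + slope * b["x"], sigma=s,
                      observed=b["y"], total_size=len(y))
        return model

    def mean_only(b):
        with pm.Model() as model:
            mu = pm.Normal("mu", mu=0, sigma=10)
            s = pm.HalfNormal("s", sigma=1)
            pm.Normal("obs", mu=mu, sigma=s,
                      observed=b["y"], total_size=len(y))
        return model

    def fit(model):
        with model:
            return pm.fit(n=n, method="advi")

    return fit_each_with_fresh_batches(
        {"linear": linear, "mean_only": mean_only},
        {"x": x, "y": y},
        make_batch=lambda arr: pm.Minibatch(arr, batch_size=batch_size),
        fit=fit,
    )
```

Calling `pymc3_example(x, y)` with two equally long NumPy arrays would return one approximation per model. Whether sharing Minibatch objects across models is actually unsupported, or whether something else causes the failed fits, is not something this sketch settles.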
I could not find any mention of this in the docs or on this forum. Please consider documenting that Minibatch objects only work for a single model and have to be recreated before each model is fitted.
Thanks!
PS: I cannot add a link to the particular docs page because of your spam filter!