Reusing a Minibatch object for several models

Hvass-Labs · August 13, 2020, 1:43pm

I have a dataset with nearly a million data points, consisting of multiple arrays of predictor variables. I use Minibatch and pm.fit to approximate the posterior, because running pm.sample on the full dataset would be too slow.

I am comparing several models that use the same data, so I tried creating a single Minibatch object for each data array and reusing it across the models. This works well for the first model, but when I fit the other models, the fit fails: the result essentially captures only the mean and standard deviation of the data. My guess is that the Minibatch objects somehow scramble the order of the data, though I have not confirmed this. Fitting the first model a second time fails in the same way.

If I create new Minibatch objects before fitting each model, the fitting works again.

I could not find any mention of this in the docs or on this forum. Please consider documenting that Minibatch objects only work for a single model and have to be recreated before each model is fitted.

Thanks!

PS: I cannot add a link to the particular docs-page because of your spam-filter!

junpenglao · August 14, 2020, 8:15am

@ferrine could comment more, but my impression is that you can reuse the same Minibatch objects, as they should stay in sync.
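
For readers wondering what "in sync" means here, the following is a minimal, library-free sketch. It does not use PyMC or its Minibatch class at all; the helper names (paired_batch, unpaired_batch, slope) and the toy data are assumptions made up for illustration. It only shows why two data arrays must be subsampled with the same row indices: if they are not, a model sees predictors and targets that no longer belong together, which would look much like the symptom described above (only the mean and spread of the data get fitted). Whether this is what actually happened in the original poster's case is not established in this thread.

```python
import random


def paired_batch(x, y, batch_size, rng):
    """Draw ONE set of row indices and apply it to both arrays (rows stay aligned)."""
    idx = [rng.randrange(len(x)) for _ in range(batch_size)]
    return [x[i] for i in idx], [y[i] for i in idx]


def unpaired_batch(x, y, batch_size, rng_x, rng_y):
    """Draw indices separately for each array (rows are no longer aligned)."""
    idx_x = [rng_x.randrange(len(x)) for _ in range(batch_size)]
    idx_y = [rng_y.randrange(len(y)) for _ in range(batch_size)]
    return [x[i] for i in idx_x], [y[i] for i in idx_y]


def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((a - mean_x) ** 2 for a in xs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    return sxy / sxx


# Toy data (hypothetical): y depends on x exactly as y = 2 * x.
x = [float(i) for i in range(1000)]
y = [2.0 * v for v in x]

# Aligned batches: the relationship between x and y is preserved.
xb, yb = paired_batch(x, y, 200, random.Random(0))
slope_paired = slope(xb, yb)

# Misaligned batches: the relationship is destroyed, the slope is near zero.
xu, yu = unpaired_batch(x, y, 200, random.Random(1), random.Random(2))
slope_unpaired = slope(xu, yu)

print("slope with aligned batches:   ", slope_paired)
print("slope with misaligned batches:", slope_unpaired)
```

With aligned batches the slope recovers the true value of 2, while with misaligned batches it lands close to zero, so only the marginal mean and spread of y remain to be fitted.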
