pm.Minibatch docstring

I am (re)learning to use ADVI with PyMC and was working through this example: Introduction to Variational Inference with PyMC (PyMC example gallery).
In that example, the authors show a rather extensive docstring for pm.Minibatch, but the current docstring for Minibatch is just the anemic:

Get random slices from variables from the leading dimension.

    Parameters
    ----------
    variable: TensorVariable
    variables: TensorVariable
    batch_size: int

    Examples
    --------
    >>> data1 = np.random.randn(100, 10)
    >>> data2 = np.random.randn(100, 20)
    >>> mdata1, mdata2 = Minibatch(data1, data2, batch_size=10)

Is the original docstring incorrect? For example, is this still correct: “Importantly, we need to make PyMC “aware” that a minibatch is being used in inference. Otherwise, we will get the wrong :math:logp for the model. To do so, we need to pass the total_size parameter to the observed node, which correctly scales the density of the model logp that is affected by Minibatch. See more in the examples below.”
If so, is this documented somewhere else now?
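To make the quoted claim concrete, here is a small standard-library sketch of why the scaling matters. It does not use PyMC at all. The toy data, the fixed parameters, and the helper names (`normal_logpdf`, `minibatch_logp`) are assumptions made up for illustration. It compares the full-data log-likelihood with the average minibatch log-likelihood, with and without the `total_size / batch_size` factor.

```python
import math
import random


def normal_logpdf(x, mu, sigma):
    """Log density of a Normal(mu, sigma) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)


rng = random.Random(0)
data = [rng.gauss(1.0, 2.0) for _ in range(1000)]  # toy data set
total_size = len(data)
batch_size = 50

# Log-likelihood of the whole data set at fixed (illustrative) parameters.
full_logp = sum(normal_logpdf(x, 1.0, 2.0) for x in data)


def minibatch_logp(scale):
    """Log-likelihood of one random minibatch, optionally rescaled to full size."""
    batch = rng.sample(data, batch_size)
    logp = sum(normal_logpdf(x, 1.0, 2.0) for x in batch)
    return logp * total_size / batch_size if scale else logp


n_draws = 2000
scaled_mean = sum(minibatch_logp(True) for _ in range(n_draws)) / n_draws
unscaled_mean = sum(minibatch_logp(False) for _ in range(n_draws)) / n_draws

print(f"full data: {full_logp:.1f}")
print(f"scaled:    {scaled_mean:.1f}")
print(f"unscaled:  {unscaled_mean:.1f}")
```

On average, the rescaled minibatch value tracks the full-data log-likelihood, while the unscaled value is smaller in magnitude by roughly a factor of `total_size / batch_size`. Without the rescaling, the likelihood is underweighted relative to the prior.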

Yes, you need to pass total_size. If you don’t, you should see a warning when you call pm.fit.

I agree the docstring is pretty sad. PRs welcome :slight_smile:
