Pm.distributions.transforms

Can someone point me to a notebook or other doc that makes clear, with examples, how to use the objects defined in pm.distributions.transforms. Maybe I’m being dense, but I can’t find anything but the briefest snippets as though everyone already understands exactly what’s going on.

Opher

This is a minimal non-working example. It fails both on my local machine under windows and in colab:

import pymc as pm

with pm.Model() as model: # (coords=coords)
    beta = pm.Normal('beta', mu=1.0, sigma=1.0, shape=[6, 2], transform=pm.distributions.transforms.simplex)

with model:
    fit = pm.sample(cores=1)

Produces the error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/aesara/compile/function/types.py in __call__(self, *args, **kwargs)
    975                 self.vm()
--> 976                 if output_subset is None
    977                 else self.vm(output_subset=output_subset)

ValueError: Input dimension mismatch. One other input has shape[1] = 2, but input[5].shape[1] = 6.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
9 frames
/usr/local/lib/python3.7/dist-packages/aesara/compile/function/types.py in __call__(self, *args, **kwargs)
    974             outputs = (
    975                 self.vm()
--> 976                 if output_subset is None
    977                 else self.vm(output_subset=output_subset)
    978             )

ValueError: Input dimension mismatch. One other input has shape[1] = 2, but input[5].shape[1] = 6.
Apply node that caused the error: Elemwise{Composite{((i0 + (i1 * sqr((i2 + i3))) + i4 + i5) - i6)}}[(0, 3)](TensorConstant{(1, 1) of ..5332046727}, TensorConstant{(1, 1) of -0.5}, TensorConstant{(1, 1) of -1.0}, beta_simplex___simplex, Elemwise{log1p,no_inplace}.0, Elemwise{Mul}[(0, 1)].0, Elemwise{Composite{(i0 * (i1 + log(i2)))}}[(0, 1)].0)
Toposort index: 27
Inputs types: [TensorType(float64, (1, 1)), TensorType(float64, (1, 1)), TensorType(float64, (1, 1)), TensorType(float64, (None, None)), TensorType(float64, (1, 1)), TensorType(float64, (1, None)), TensorType(float64, (1, None))]
Inputs shapes: [(1, 1), (1, 1), (1, 1), (6, 2), (1, 1), (1, 6), (1, 6)]
Inputs strides: [(8, 8), (8, 8), (8, 8), (16, 8), (8, 8), (48, 8), (48, 8)]
Inputs values: [array([[-0.91893853]]), array([[-0.5]]), array([[-1.]]), 'not shown', array([[0.69314718]]), 'not shown', 'not shown']
Outputs clients: [[Sum{acc_dtype=float64}(beta_simplex___logprob)]]

HINT: Re-running with most Aesara optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the Aesara flag 'optimizer=fast_compile'. If that does not work, Aesara optimizations can be disabled with 'optimizer=None'.
HINT: Use the Aesara flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.

We should definitely add some documentation on how transforms work.

Haven’t had the time to look at your issue yet.

Thanks! I posted it twice, once as a main thread and here also as a reply.

I’m trying to figure out the problem by debugging into the PyMC code, but I’m not really knowledgable enough.

Opher

I believe I’ve found the bug. In the aeppl SimplexTransform class (pymc distributions.simplex is just a wrapper for this class), there is a forward function:

    def forward(self, value, *inputs):
        log_value = at.log(value)
        shift = at.sum(log_value, -1, keepdims=True) / value.shape[-1]
        return log_value[..., :-1] - shift

I believe this function fails on vectors because it assumes the last line assumes there is at least a two dimensional structure. That’s what I saw when testing in aesara:

x = at.dmatrix('x')
y = at.sum(x, -1, keepdims=True) / x.shape[-1]
z = x[..., :-1]
f = ae.function([x], y)
g = ae.function([x], z)

x_val = np.array([[1], [0.5], [0.5]])
print(f(x_val))
print(g(x_val))

Gives:

[[1. ]
 [0.5]
 [0.5]]
[]

I’m not sure how I should move this forward further.

Opher

It looks to me like pymc SumTo1 class has the same essential code, so if this is a bug, it’s now located completely within the pymc codebase, if that helps.

Opher

Shouldn’t your x_val have a shape of atleast 2 in the last dimension? Otherwise the simplex can’t work.

I agree that it doesn’t seem to work with only two columns but I don’t see why it shouldn’t. The two columns need to add to 1.

Is there another way to do it?

Opher

It should work with two columns. Your test case above had a single column only. You original snippet however had 2, so I don’t have a good guess as to why it fails.

Sorry. Good point. Then I’m not sure what is going wrong! I’ll try to make a minimal example.