Questions on dependently-sized variables (possible bug?)

Hi, asking for help with a crash I’m getting. I wrote a PyMC model that uses a derived quantity as the shape parameter of another variable, and later feeds that variable into a custom Op. Given the way the shape is computed, a certain invariant should hold, but it appears not to hold in practice and makes my custom Op crash (at least I believe so; I’ve been struggling to get to the actual error in pdb). A sketch of the code will make this clear:

with model:
    # this boolean array represents a subset of a certain set
    subset_mask = pm.Bernoulli("subset", p=p_constant, shape=shape_constant)
    subset_size = subset_mask.sum()  # the cardinality of the subset
    # some weights associated with subset elements, plus two for reasons
    weights = pm.Uniform("weights", lower=constant_lower, upper=constant_upper, shape=subset_size + 2)
    my_op_result = my_op(subset_mask, weights)
    # ...
    pm.sample()

As you can see, the way weights is defined should ensure a certain invariant, namely weights.shape[0] == subset_mask.sum() + 2. However, when I actually run the model, this sometimes fails to hold and makes my_op crash. Specifically, at the first or perhaps second iteration of each chain, my_op gets called with a subset_mask that is not all zeros but a weights of size 2. My guess is that subset_mask gets initialized to all zeros, a weights of size two is computed accordingly, and then when subset_mask is updated, either weights does not follow, or it does but the old size is retained.

So my question is: is this sort of dependency supposed to be possible? If yes, am I doing it wrong (perhaps I have to mark the dependency or the variable size in some way), or am I hitting a bug in PyMC/PyTensor?

Edit: something I find quite confusing: pm.draw does not lead to errors, only pm.sample (I’ve been using step=pm.Metropolis()).

Can you post the traceback? At the PyTensor level, there is no requirement that sizes stay static between evaluations (only the number of dimensions is fixed).

At the sampling level, RVs aren’t allowed to change size from draw to draw. My understanding is that it breaks some assumptions in NUTS, and people have cooked up alternative sampling schemes to accommodate a changing size of the underlying parameter space; reversible-jump MCMC is one, I think. We don’t have it in PyMC. We do have SMC, which I think can also handle it, subject to the usual SMC caveats (small number of parameters).

Edit: I see you’re doing MH sampling, so my above comments don’t necessarily apply. I’m not an expert on samplers. It might still be a hard posterior to sample, though.

Thanks, that answers my question, I think! As in: this is not supposed to work. I guess I can try to work around it by drawing a constant-sized array and using masks somehow.

The traceback is huge but the relevant part is:

AssertionError:
Apply node that caused the error: GrammarSetOp{grammar_set_id=140260536408384, method='combinatorial'}(inarray1, prim_set_shared, Composite{((10.0 * sigmoid(i0)) + (0.1 * (1.0 - sigmoid(i0))))}.0)
Toposort index: 10
Inputs types: [TensorType(int64, shape=(19,)), TensorType(int64, shape=(39,)), TensorType(float64, shape=(None,))]
Inputs shapes: [(19,), (39,), (2,)]
Inputs strides: [(8,), (8,), (8,)]
Inputs values: ['not shown', 'not shown', array([5.05, 5.05])]
Outputs clients: [[ExpandDims{axis=1}(GrammarSetOp{grammar_set_id=140260536408384, method='combinatorial'}.0)], []]

It’s an assertion error because I added an assertion for the invariant in my custom op’s perform. The node is my custom op; I simplified it above, but there are actually two boolean arrays as inputs. As you can see, the third input is of size 2, and that is the case every time I get this error, even though it should be bigger whenever the first two inputs are not all zeros.

I tried sample_smc and it also failed, at the following line, so I think perhaps it supports this in theory but not in practice:

  File "/home/me/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/pymc/sampling/forward.py", line 347, in draw
    return [np.stack(v) for v in drawn_values]

Edit: although yes, I am using Metropolis and not NUTS, it seems to be making the same assumptions.

The huge traceback is all relevant, don’t feel like you have to truncate it.

Is the assert checking that the weights are of a fixed size? Or is it being used to stick some values into a fixed-length vector, whose shape you then check? The way it’s set up now, the shape of the weights will depend on the draw of subset_mask; I don’t know if that’s your intention.

If you can share the op, that would also be useful 🙂

For example, this should run fine:

import pymc as pm
import pytensor.tensor as pt

shape_constant = 10

with pm.Model() as m:
    subset_mask = pm.Bernoulli("subset", p=0.5, shape=(shape_constant, ))
    subset_size = subset_mask.sum() # the cardinality of the subset
    # some weights associated to subset elements, plus two for reasons
    weights = pm.Uniform("weights", upper=10, lower=0, shape=subset_size+2)
    
    x = pt.zeros(shape_constant)[subset_mask].set(weights[1:-1])
    idata = pm.sample(step=pm.Metropolis())

It’s really too much context to share; the Op is a lot of NumPy code involving a custom class. But I can say that the assert simply checks the invariant I described. Before I added it, a call to np.take somewhere further down in the op code would fail instead.

def perform(self, node, inputs, output_storage):
    # note: self.grammar_set.n_extra_dims is 2 in practice
    assert len(inputs[2]) == inputs[0].sum() + inputs[1].sum() + self.grammar_set.n_extra_dims

Yes, it’s very much the intention, but in practice it appears not to hold; that’s the issue.

I ran it, and it doesn’t: the size of weights is systematically 2 if p < 0.5 and 12 if p >= 0.5. It doesn’t crash only because nothing downstream depends on the shapes being correct.

I think you were right the first time! This is just not supported.

Here is the backtrace (for the first of three consecutive exceptions) if you insist:

---------------------------------------------------------------------------
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/pytensor/compile/function/types.py", line 1037, in __call__
    outputs = vm() if output_subset is None else vm(output_subset=output_subset)
              ~~^^
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/pytensor/graph/op.py", line 531, in rval
    r = p(n, [x[0] for x in i], o)
  File "/home/emile/Ling/Postdoc-ZAS/Number (SM)/numeral-production/grammar.py", line 324, in perform
    assert len(inputs[2]) == inputs[0].sum() + inputs[1].sum() + self.grammar_set.n_extra_dims, "incoherent input sizes"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: incoherent input sizes

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/pymc/sampling/parallel.py", line 154, in run
    self._start_loop()
    ~~~~~~~~~~~~~~~~^^
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/pymc/sampling/parallel.py", line 211, in _start_loop
    point, stats = self._step_method.step(self._point)
                   ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/pymc/step_methods/compound.py", line 276, in step
    point, sts = method.step(point)
                 ~~~~~~~~~~~^^^^^^^
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/pymc/step_methods/arraystep.py", line 116, in step
    apoint, stats = self.astep(q)
                    ~~~~~~~~~~^^^
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/pymc/step_methods/metropolis.py", line 301, in astep
    accept_rate_i = self.delta_logp(q_temp, q0d)
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/pytensor/compile/function/types.py", line 1047, in __call__
    raise_with_op(
    ~~~~~~~~~~~~~^
        self.maker.fgraph,
        ^^^^^^^^^^^^^^^^^^
    ...<2 lines>...
        storage_map=getattr(self.vm, "storage_map", None),
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/pytensor/link/utils.py", line 526, in raise_with_op
    raise exc_value.with_traceback(exc_trace)
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/pytensor/compile/function/types.py", line 1037, in __call__
    outputs = vm() if output_subset is None else vm(output_subset=output_subset)
              ~~^^
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/pytensor/graph/op.py", line 531, in rval
    r = p(n, [x[0] for x in i], o)
  File "/home/emile/Ling/Postdoc-ZAS/Number (SM)/numeral-production/grammar.py", line 324, in perform
    assert len(inputs[2]) == inputs[0].sum() + inputs[1].sum() + self.grammar_set.n_extra_dims, "incoherent input sizes"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: incoherent input sizes
Apply node that caused the error: GrammarSetOp{grammar_set_id=139887627014288, method='combinatorial'}(inarray1, prim_set_shared, Composite{((10.0 * sigmoid(i0)) + (0.1 * (1.0 - sigmoid(i0))))}.0)
Toposort index: 10
Inputs types: [TensorType(int64, shape=(19,)), TensorType(int64, shape=(39,)), TensorType(float64, shape=(None,))]
Inputs shapes: [(19,), (39,), (2,)]
Inputs strides: [(8,), (8,), (8,)]
Inputs values: ['not shown', 'not shown', array([5.05, 5.05])]
Outputs clients: [[ExpandDims{axis=1}(GrammarSetOp{grammar_set_id=139887627014288, method='combinatorial'}.0)], []]

Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
  File "<ipython-input-3-d92932096aff>", line 1, in <module>
    get_ipython().run_line_magic('run', 'numeral-production/pymc_poc_inline.py data_ngram_full.csv -c en-2019 -m 100 -w 100 --start 1 --algo metropolis --opt-method combinatorial -d 100 -t 100')
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/IPython/core/interactiveshell.py", line 2486, in run_line_magic
    result = fn(*args, **kwargs)
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/IPython/core/magics/execution.py", line 860, in run
    run()
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/IPython/core/magics/execution.py", line 845, in run
    runner(filename, prog_ns, prog_ns,
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/IPython/core/pylabtools.py", line 230, in mpl_execfile
    safe_execfile(fname, *where, **kw)
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/IPython/core/interactiveshell.py", line 2904, in safe_execfile
    py3compat.execfile(
  File "/home/emile/Dev/python/miniforge3/envs/pymc/lib/python3.13/site-packages/IPython/utils/py3compat.py", line 56, in execfile
    exec(compiler(f.read(), fname, "exec"), glob, loc)
  File "/home/emile/Ling/Postdoc-ZAS/Number (SM)/numeral-production/pymc_poc_inline.py", line 105, in <module>
    costs = pm.Deterministic("costs", grammar_set_op(constant_set, prim_set, cost_weight)[0])

HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
"""

Ah yeah, that makes sense. I guess in principle nothing is wrong with it, but how would the outputs be stored, since everything needs to be a rectangular array? I think this is simply not supported, unless you can hide it all inside a distribution that spits out a sparse weight vector with static shape.

Yeah, the fact that my op doesn’t fail with SMC, just the tracking code, suggests it actually works in principle. I would assume there is no deep reason it can’t work with Metropolis either. But at any rate, I’m just going to find an equivalent model that doesn’t have variable-size parameters; it’s fine.

PyMC samplers don’t handle variables with dynamic sizes. PyMC calls model.initial_point() at the start of sampling and then assumes those shapes will hold for the entirety of sampling.

It’s not just a dumb limitation; it’s genuinely hard to write variable-size samplers. You need something like reversible-jump Markov chain Monte Carlo (https://en.wikipedia.org/wiki/Reversible-jump_Markov_chain_Monte_Carlo).

Then you also need a good data structure to store the ragged arrays

Can you rewrite your model so the shape is constant (pad it to the same max size) but your likelihood is told to ignore the padded values?

Tuning / convergence may still struggle, but at least you’ll have a chance at a more interesting failure mode.

My model has the property that high values of the weights end up not affecting the likelihood, so I did this:

weights = pm.Uniform(..., shape=shape_constant)
mask = pm.Bernoulli(..., shape=shape_constant)
# masked-out positions get big_number, which the likelihood effectively ignores
actual_weights = weights * mask + (1 - mask) * big_number

which I should have done from the beginning, since it’s cleaner anyway (I expect the weight variable at a given position to be more independent of the other parameters when written like this).