Scan through an array of random variables Gamma distribution

mercer · May 24, 2024, 12:55am

Hi, I was getting a MissingInputError while testing a model that uses a nested scan, and I traced the error to a gamma distribution. Basically, I am trying to use scan to iterate through an array of random variables created with the shape argument of a truncated gamma distribution, but I keep getting errors. In the full model it was a MissingInputError; to isolate the problem, I wrote a function that tests the gamma distribution on its own:

import pymc as pm
import pytensor.tensor as pt
from pytensor import scan

def test_d_sigma():
    with pm.Model() as model:
        d_sigma = pm.Truncated('d_sigma', dist = pm.Gamma.dist(alpha = 2, beta = 1), lower = 1e-9, shape = 40)
        
        def main_participant_update(d_sigma_i, test_output_prev):
           test_output = d_sigma_i * 2
           return [test_output]
    
        test_output_init = pt.scalar()
        [test_output_seq], _ = scan(
            fn = main_participant_update,
            sequences = [d_sigma],
            outputs_info = [test_output_init],
        )
        trace = pm.sample(1000, cores = 1, chains = 4, tune = 2000, return_inferencedata=False)
        print(trace['d_sigma'])
        
test_d_sigma()

The error I get here is:

Traceback (most recent call last):
  File "C:\Users\User\.conda\envs\pymc_env\Lib\site-packages\pytensor\tensor\subtensor.py", line 2867, in _get_vector_length_Subtensor
    arg_len = get_vector_length(var.owner.inputs[0])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\.conda\envs\pymc_env\Lib\site-packages\pytensor\tensor\__init__.py", line 88, in get_vector_length
    return _get_vector_length(getattr(v.owner, "op", v), v)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\.conda\envs\pymc_env\Lib\functools.py", line 909, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\.conda\envs\pymc_env\Lib\site-packages\pytensor\tensor\__init__.py", line 94, in _get_vector_length
    raise ValueError(f"Length of {var} cannot be determined")
ValueError: Length of Scan{scan_fn, while_loop=False, inplace=none}.0 cannot be determined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\Users\User\Documents\Fear Generalization\tempCodeRunnerFile.python", line 49, in <module>        
    test_d_sigma()
  File "c:\Users\User\Documents\Fear Generalization\tempCodeRunnerFile.python", line 41, in test_d_sigma    
    [test_output_seq], _ = scan(
    ^^^^^^^^^^^^^^^^^
  File "C:\Users\User\.conda\envs\pymc_env\Lib\site-packages\pytensor\tensor\variable.py", line 616, in __iter__
    for i in range(pt.basic.get_vector_length(self)):
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\.conda\envs\pymc_env\Lib\site-packages\pytensor\tensor\__init__.py", line 88, in get_vector_length
    return _get_vector_length(getattr(v.owner, "op", v), v)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\.conda\envs\pymc_env\Lib\functools.py", line 909, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\.conda\envs\pymc_env\Lib\site-packages\pytensor\tensor\subtensor.py", line 2870, in _get_vector_length_Subtensor
    raise ValueError(f"Length of {var} cannot be determined")
ValueError: Length of Subtensor{start:}.0 cannot be determined

I also tried defining the gamma distribution without the truncation, and passing the distribution as a non-sequence so that the indexing is more explicit, like this:

import pymc as pm
import pytensor.tensor as pt
from pytensor import scan

def test_d_sigma():
    with pm.Model() as model:
        d_sigma = pm.Truncated('d_sigma', dist = pm.Gamma.dist(alpha = 2, beta = 1), lower = 1e-9, shape = 40)
        
        def main_participant_update(idx, test_output_prev, d_sigma):
           test_output = d_sigma[idx] * 2
           return [test_output]
 
        test_output_init = pt.scalar()
        [test_output_seq], _ = scan(
            fn = main_participant_update,
            sequences = [pt.arange(40)],
            outputs_info = [test_output_init],
            non_sequences = [d_sigma]
        )
        trace = pm.sample(1000, cores = 1, chains = 4, tune = 2000, return_inferencedata=False)
        print(trace['d_sigma'])
        
test_d_sigma()

But I still get the same error.
Why is this happening? Any help would be much appreciated.

jessegrabowski · May 24, 2024, 3:57am

Looks like the problem is test_output_init = pt.scalar(). This is a symbolic tensor that has no source of data, so PyMC cannot know what value it should take when it runs your model. You should replace it with either 1) a random variable, or 2) an actual value wrapped in pt.as_tensor_variable.

mercer · May 25, 2024, 5:04pm

I tried replacing the original test_output_init with test_output_init = pm.Normal('test_output_init', mu=0, sigma=1) and then with test_output_init = pt.as_tensor_variable(0.0, dtype='float64'), but in both cases I got the same error as in the original post. I don't really care how test_output_init ends up being initialized, because it's just a procedural value that I need for the sake of the scan operation, but nothing seems to work.

jessegrabowski · May 25, 2024, 11:50pm

The problem was simpler: your scan returns only one output (because you have only one entry in outputs_info), so you don’t need the implicit list unpacking in the return assignment:

        test_output_seq, _ = scan(

Whether or not you care about the initial state value, you still need to initialize it somehow.

mercer · May 27, 2024, 10:07am

Thank you, doing this: test_output_seq = scan( solved the error in the function. I still get errors in the model, but probably because I'm using nested scans with multiple outputs.
