I’m trying to implement a correlated random walk over the slope and intercept parameters of a linear model, loosely based on the multivariate random walk tutorial on the PyMC site. However, I keep running into errors when sampling the prior predictive and the posterior. I’m relatively new to PyMC, so any help is greatly appreciated. Here is some sample code that reproduces the errors:
```python
import numpy as np
import pymc as pm

# Simple synthetic linear data
x = np.random.uniform(-1, 1, 100)
x.sort()
y = x * 10 + np.random.normal(0, 0.1, 100)
```
The model I’m trying to fit looks like this:
```python
with pm.Model() as mv_randomwalk:
    # x = pm.MutableData('x', x_t)

    # Noise scale for the likelihood
    sigma = pm.Normal('sigma', 0, 1)

    # LKJCholeskyCov for the covariance between alpha and beta
    chol_alpha_beta, *_ = pm.LKJCholeskyCov(
        'chol_alpha_beta', n=2, eta=2,
        sd_dist=pm.HalfCauchy.dist(2.5), compute_corr=True,
    )

    # Correlated random walks for alpha and beta
    alpha_beta = pm.MvGaussianRandomWalk(
        'alpha_beta', mu=np.zeros(2), chol=chol_alpha_beta, shape=(x.shape, 2),
    )

    # Extract the per-step parameters
    alpha = pm.Deterministic('alpha', alpha_beta[:, 0])
    beta = pm.Deterministic('beta', alpha_beta[:, 1])
    print(x.ndim)

    # Regression
    mu = beta * x + alpha

    # Likelihood
    y_hat = pm.Normal('y_hat', mu, sigma=sigma, observed=y)
```
```python
with mv_randomwalk:
    idata = pm.sample_prior_predictive()

with mv_randomwalk:
    idata.extend(pm.sample(nuts_sampler='numpyro'))
```
Here is the error I get when sampling the prior predictive:
```
ValueError: scale < 0
Apply node that caused the error: normal_rv{0, (0, 0), floatX, True}(RandomGeneratorSharedVariable(<Generator(PCG64) at 0x2374384A0>), [100], 11, Composite{((i0 * i1) + i2)}.0, sigma)
Toposort index: 14
Inputs types: [RandomGeneratorType, TensorType(int64, shape=(1,)), TensorType(int64, shape=()), TensorType(float64, shape=(100,)), TensorType(float64, shape=())]
Inputs shapes: ['No shapes', (1,), (), (100,), ()]
Inputs strides: ['No strides', (8,), (), (8,), ()]
Inputs values: [Generator(PCG64) at 0x2374384A0, array([100]), array(11), 'not shown', array(-0.28429503)]
Outputs clients: [['output'], ['output']]
HINT: Re-running with most PyTensor optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the PyTensor flag 'optimizer=fast_compile'. If that does not work, PyTensor optimizations can be disabled with 'optimizer=None'.
HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
```
And when I try to run the NUTS sampler, I get this dimensions-related error:
```
ValueError: Input dimension mismatch: (input[0].shape[1] = 3, input[1].shape[1] = 75)
Apply node that caused the error: Composite{switch(i4, (((-0.5 * sqr(((i0 - i1) / i2))) - 0.9189385332046727) - i3), -inf)}(<Matrix(float64, shape=(?, ?))>, Add.0, ExpandDims{axes=[0, 1]}.0, Log.0, Gt.0)
Toposort index: 23
Inputs types: [TensorType(float64, shape=(None, None)), TensorType(float64, shape=(1, 75)), TensorType(float64, shape=(1, 1)), TensorType(float64, shape=(1, 1)), TensorType(bool, shape=(1, 1))]
Inputs shapes: [(75, 3), (1, 75), (1, 1), (1, 1), (1, 1)]
Inputs strides: [(24, 8), (600, 8), (8, 8), (8, 8), (1, 1)]
Inputs values: ['not shown', 'not shown', array([[0.77063881]]), array([[-0.26053548]]), array([[ True]])]
Outputs clients: [[Sum{axes=None}(sigma > 0)]]
HINT: Re-running with most PyTensor optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the PyTensor flag 'optimizer=fast_compile'. If that does not work, PyTensor optimizations can be disabled with 'optimizer=None'.
HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
```
I’m particularly confused about where this `input[0].shape[1] = 3` is coming from. It seems the output of `MvGaussianRandomWalk` must have a shape different from what I would expect (something like `(100, 1, 3)`?). I would expect `alpha` and `beta` to each have shape `(100,)`, corresponding to one of the two columns of the `(100, 2)` random walk I specified. When I naively try to inspect the shape of `beta` with `print(beta.shape)`, all I get back is `Shape.0`.
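To make my shape expectations concrete, here is the plain-NumPy analogue of the indexing and broadcasting I’m trying to express (NumPy only, no PyMC; the `walk` array is just a stand-in for what I assume `MvGaussianRandomWalk` returns):

```python
import numpy as np

# Stand-in for what I *expect* MvGaussianRandomWalk to yield:
# one (alpha, beta) pair per observation.
walk = np.zeros((100, 2))

alpha = walk[:, 0]  # one column -> shape (100,)
beta = walk[:, 1]   # one column -> shape (100,)

x = np.random.uniform(-1, 1, 100)
mu = beta * x + alpha  # elementwise -> shape (100,)
print(alpha.shape, beta.shape, mu.shape)  # (100,) (100,) (100,)
```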
I have a feeling I have two related issues here. First, the model is probably mis-specified in some way; I suspect I am not indexing the `alpha` and `beta` random walks correctly. Second, as a result, the `x` array and the `alpha` and `beta` variables do not end up with matching shapes. I’m relatively new to PyMC (and PyTensor) and could use some hand-holding here: how can I inspect the output of PyMC variables to see things like size and shape? What else am I doing wrong? Thanks for the great package and for any help.