ARMA implementation using Numpyro

Hey there!

I’m new to pymc. Tried implementing an ARMA model for a 1 dimensional vector, following the great notebook by Jesse Grabowski.

After trying to run it without Numpyro Jax, it took too long (not sure it can even run) so I opted to using Numpyro in pymc.

I get a recurring error. I looked at the pytensor.scan documentation but couldn’t solve it. I will say that I haven’t worked with Numpyro before, so I might misunderstand how to use it.
I’d appreciate it if someone could help me understand where I went wrong and if my problem is not with intensive computation but with wrong implementation.

This is my code:

import pymc as pm
import pytensor.tensor as pt
from pytensor.scan import scan
from pytensor.compile.mode import get_mode
import numpy as np

with pm.Model() as model:
    # Non-zero mean
    mu = pm.Normal('mu', 0, 1, dtype='float32')
    
    # GARCH parameters
    omega = pm.Uniform('omega', 0, 1, dtype='float32')
    alpha = pm.Uniform('alpha', 0, 1, dtype='float32')
    beta = pm.Uniform('beta', 0, (1 - alpha), dtype='float32')
    
    # ARMA parameters
    theta = pm.Normal('theta', 0, 1, dtype='float32')
    rho = pm.HalfNormal('rho', 1, dtype='float32')
    
    # Initial values 
    sigma_sq_init = pm.Exponential('sigma_sq_init', 1, dtype='float32')
    epsilon_init = pm.Normal('epsilon_init', 0, 1, dtype='float32')
    y_init = pm.Normal('y_init', 0, 1, dtype='float32')
    
    # Data setup
    X_data = pm.Data('X_data', returns_intra)
    y_data = pm.Data('y_data', returns_intra)
    
    # Concatenate initial value to data for y_tm1
    X = pt.concatenate([y_init[None], X_data])   # Convert to float64
    
    def step(y_true, y_tm1, sigma_sq_tm1, epsilon_tm1, ω, α, β, μ, ρ, θ):
        # Predicted mean μ_t incorporating ARMA(1,1) dynamics
        mu_t = pt.cast(μ + θ * epsilon_tm1 + ρ * y_tm1, 'float32')
        
        # GARCH(1,1) process for σ2
        sigma_sq = pt.cast(ω + α * epsilon_tm1**2 + β * sigma_sq_tm1, 'float32')
        
        # Residuals (epsilon_t = y_t - mu_t)
        epsilon = pt.cast(y_true - mu_t, 'float32')

        return sigma_sq, epsilon, mu_t

    [sigmas, epsilons, y_hat], updates = pytensor.scan(
        fn=step,
        outputs_info=[sigma_sq_init, epsilon_init, None],
        sequences=[{'input': X, 'taps': [0, -1]}],
        non_sequences=[omega, alpha, beta, mu, rho, theta],
        strict=True,
        mode=get_mode("JAX")  # Adjust mode as needed
    )

    # Define the likelihood using y_hat and sigmas
    obs = pm.Normal('obs', mu=y_hat, sigma=pt.sqrt(sigmas), observed=y_data, dtype='float32')
    
    # Save the results for analysis (optional)
    pm.Deterministic('sigmas', sigmas)
    pm.Deterministic('epsilons', epsilons)
    pm.Deterministic('y_hat', y_hat)
    
    # Sample from the posterior
    idata = pm.sample(
        draws=1000, 
        chains=4,  
        cores=4, 
        init='jitter+adapt_diag_grad', 
        target_accept=0.95,
    )

And the error I got:

Traceback (most recent call last):
  File "c:\Users\maorb\anaconda3\envs\pymc_env\Lib\site-packages\pytensor\compile\function\types.py", line 959, in __call__
    self.vm()
  File "c:\Users\maorb\anaconda3\envs\pymc_env\Lib\site-packages\pytensor\scan\op.py", line 1649, in rval
    r = p(n, [x[0] for x in i], o)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\maorb\anaconda3\envs\pymc_env\Lib\site-packages\pytensor\scan\op.py", line 1577, in p
    t_fn, n_steps = scan_perform_ext.perform(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pytensor\\scan\\scan_perform.pyx", line 397, in pytensor.scan.scan_perform.perform
AttributeError: 'ArrayImpl' object has no attribute 'data'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\Users\maorb\anaconda3\envs\pymc_env\Lib\site-packages\pymc\sampling\parallel.py", line 128, in run
    self._start_loop()
  File "c:\Users\maorb\anaconda3\envs\pymc_env\Lib\site-packages\pymc\sampling\parallel.py", line 180, in _start_loop
    point, stats = self._step_method.step(self._point)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\maorb\anaconda3\envs\pymc_env\Lib\site-packages\pymc\step_methods\arraystep.py", line 173, in step
...
Inputs values: [array(3973, dtype=int64), 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', array([ 0.        , -0.78269994], dtype=float32), array([ 0.        , -0.08729921], dtype=float32), array([ 0.       , -0.3879588], dtype=float32), array([ 0.        , -0.67018235], dtype=float32), array([0.        , 0.00013272], dtype=float32), array([0.        , 0.22382082], dtype=float32), array(3973, dtype=int64), array(3973, dtype=int64), array(0.45758057, dtype=float32), array(0.2256364, dtype=float32), array(1.4903326, dtype=float32), array(-0.386238, dtype=float32)]
Outputs clients: [[Subtensor{start:stop:step}(Scan{grad_of_scan_fn, while_loop=False, inplace=all}.0, ScalarFromTensor.0, ScalarFromTensor.0, -1)], [Subtensor{start:stop:step}(Scan{grad_of_scan_fn, while_loop=False, inplace=all}.1, ScalarFromTensor.0, ScalarFromTensor.0, -1)], [Subtensor{i}(Scan{grad_of_scan_fn, while_loop=False, inplace=all}.2, -1)], [Subtensor{i}(Scan{grad_of_scan_fn, while_loop=False, inplace=all}.3, -1)], [Subtensor{i}(Scan{grad_of_scan_fn, while_loop=False, inplace=all}.4, -1)], [Subtensor{i}(Scan{grad_of_scan_fn, while_loop=False, inplace=all}.5, -1)], [Subtensor{i}(Scan{grad_of_scan_fn, while_loop=False, inplace=all}.6, -1)], [Subtensor{i}(Scan{grad_of_scan_fn, while_loop=False, inplace=all}.7, -1)], [Subtensor{::step}(Scan{grad_of_scan_fn, while_loop=False, inplace=all}.8, -1)], [Subtensor{::step}(Scan{grad_of_scan_fn, while_loop=False, inplace=all}.9, -1)]]

HINT: Re-running with most PyTensor optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the PyTensor flag 'optimizer=fast_compile'. If that does not work, PyTensor optimizations can be disabled with 'optimizer=None'.
HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.

Thanks!

Have you checked out the new tutorial notebook about building time series with pm.CustomDist? My ARIMA-GARCH notebook is somewhat out of date. The model as you’ve written it has no stochastic innovations at each time step, it is a deterministic trajectory with iid measurement error added. Basically you want this:

y_t = \rho y_{t-1} + \epsilon_t

Where y_t is describing your data. But what you wrote is:

\begin{align}x_t &= \rho x_{t-1} \\ y_t &= x_t + \epsilon_t \end{align}

Where again, y_t is the data you’re modeling.

Also you’re still implementing them model in pytensor, not numpyro. To use the numpyro MCMC sampler (which is written in JAX) to sample, you just need to do idata = pm.sample(nuts_sampler='numpyro'). There’s no other changes required in that respect.

I’ve checked out the new notebook and it looks great, pm.sample(nuts_sampler=‘numpyro’) solved my original problem.

Thanks!