Help Needed: Kalman Filter with PyMC and pytensor.tensor.KalmanFilter Error

Hi everyone,

I’m reaching out to the community for some help with a roadblock I’ve encountered in my code involving PyMC and Kalman Filters.

Project Goal:

I’m working on a project that utilizes PyMC for Bayesian inference with time-varying parameters (TVP). Within this project, I’ve defined a function called fit_tvp_pvar that incorporates Kalman Filter imputation for handling missing values.

Errors:

  1. Missing Kalman Filter:
  • When attempting to use the Kalman Filter for imputation, I hit this error: module 'pytensor.tensor' has no attribute 'KalmanFilter'. This suggests that pytensor.tensor does not ship a built-in Kalman Filter implementation.
  2. PyMC Model Definition/Sampling Error:
  • After imputing missing values with an alternative method (e.g., iterative imputation), the code throws an error during PyMC model definition or sampling: ERROR (pytensor.graph.rewriting.basic): Rewrite failure due to: constant_folding. Unfortunately, this error message isn’t very specific, which makes troubleshooting difficult.

Code Snippet:

Here’s the relevant part of the code, showing the fit_tvp_pvar method and the kalman_imputation function:

I first ran this version, which uses NumPy (rather than PyTensor) inside the Kalman filter:
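(For reference, the snippets below assume imports along these lines; the KalmanFilter used in kalman_imputation isn’t imported in the snippet, so I’m assuming it comes from pykalman, since that’s the API it follows.)

import sys
import time
import traceback

import numpy as np
import pymc as pm
import pytensor.tensor as pt
from pykalman import KalmanFilter                            # assumed: the import is not shown in the original snippet
from sklearn.experimental import enable_iterative_imputer    # noqa: F401 (required before importing IterativeImputer)
from sklearn.impute import IterativeImputer, KNNImputer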

def fit_tvp_pvar(self, num_iterations=10000, burn=5000, tune=5000, cores=1, delay=1):
    print("Starting fit_tvp_pvar method")

    imputation_methods = ['kalman', 'iterative', 'knn']
    imputed = False
    
    for method in imputation_methods:
        try:
            print(f"Attempting {method} imputation")
            # Imputation process ...

            if method == 'kalman':
                data = self.kalman_imputation(self.data[self.vars].values.astype('float32'))
            elif method == 'iterative':
                imputer = IterativeImputer(max_iter=100, random_state=self.seed)
                data = imputer.fit_transform(self.data[self.vars].values.astype('float32'))
            elif method == 'knn':
                imputer = KNNImputer()
                data = imputer.fit_transform(self.data[self.vars].values.astype('float32'))

            data_tensor = pt.as_tensor_variable(data.astype('float32'))

            # Checking for inf or nan values ...

            if not pt.any(pt.isnan(data_tensor)).eval():
                imputed = True
                self.n_vars = data_tensor.shape[1].eval()
                print(f"Imputation successful with {method} method.")
                break
            else:
                print(f"Inf or nan values found after {method} imputation")
                # Handle inf or nan values ...

            time.sleep(delay)

        except Exception as e:
            print(f"{method} imputation failed: {e}")
            print(f"Error occurred at line {sys.exc_info()[2].tb_lineno} in fit_tvp_pvar")
            print(f"Error details: {traceback.format_exc()}")
            print("Trying the next method...")

    if not imputed:
        raise ValueError("All imputation methods failed.")

    try:
        print(f"Starting PyMC model definition")
        with pm.Model() as tvp_pvar_model:
            # Priors
            sd_dist = pm.HalfNormal('sd_dist', sigma=1.0)
            log_sd_vals = pm.Normal("log_sd_vals", mu=0, sigma=1, shape=(self.n_vars,))
            exp_log_sd_vals = pt.exp(log_sd_vals)
            sd_vals = pm.Deterministic("sd_vals", exp_log_sd_vals)
            alphas = pm.Normal("alphas", mu=0, sigma=1, shape=(self.n_obs, self.n_vars))

            # State transitions using loop
            state_vars = []
            for t in range(self.n_obs):
                prev_state = state_vars[t - 1] if t > 0 else pt.zeros((1, self.p * self.n_vars))
                shocks = pm.Normal(f"shocks_t{t}", mu=0.0, sigma=sd_vals, shape=(self.n_vars,))
                new_state = pt.zeros_like(prev_state)

                for lag in range(self.p):
                    is_not_finite = pt.or_(pt.isinf(alphas[t - lag - 1]), pt.isnan(shocks[t - lag]))
                    new_state += pt.dot(pt.switch(~is_not_finite, alphas[t - lag - 1], 0.0), shocks[t - lag])

                state_vars.append(new_state)

                # Likelihood
                etas = pm.Normal(f"etas_t{t}", 
                                 mu=pt.dot(state_vars[t], pt.repeat(pt.arange(self.p)[::-1], self.n_obs, axis=0)),
                                 sigma=sd_vals, 
                                 observed=data_tensor[t])

            print(f"Starting MCMC sampling")
            # MCMC sampling
            trace = pm.sample(draws=num_iterations, tune=tune, chains=cores)
            self.trace = trace

    except Exception:
        print("Exception occurred during PyMC model definition or sampling:")
        exc_type, exc_value, exc_traceback = sys.exc_info()
        print(f"Error occurred at line {exc_traceback.tb_lineno} in fit_tvp_pvar")
        print(''.join(traceback.format_exception(exc_type, exc_value, exc_traceback)))

def kalman_imputation(self, data):
    """
    Impute missing values using the Kalman Filter.
    """
    print("Starting Kalman imputation")
    print("Note: Using NumPy for Kalman imputation as it operates on NumPy arrays, not PyTensor tensors")
    
    try:
        data = np.where(np.isinf(data), np.nan, data)
        initial_state_mean = np.nanmean(data, axis=0)
        initial_state_mean = np.where(np.isnan(initial_state_mean), 0, initial_state_mean)
        
        kf = KalmanFilter(initial_state_mean=initial_state_mean, n_dim_obs=data.shape[1])
        masked_data = np.ma.masked_invalid(data)
        
        imputed_data, _ = kf.em(masked_data, n_iter=10).smooth(masked_data)
        
        if isinstance(imputed_data, np.ma.MaskedArray):
            filled_data = imputed_data.filled(fill_value=np.nanmean(imputed_data))
        else:
            filled_data = imputed_data
        
        return filled_data

    except Exception as e:
        print(f"Error in kalman_imputation: {e}")
        print(f"Error occurred at line {sys.exc_info()[2].tb_lineno} in kalman_imputation")
        print(f"Error details: {traceback.format_exc()}")
        raise

which produces this error:

Starting fit_tvp_pvar method
Attempting kalman imputation
Starting Kalman imputation
Note: Using NumPy for Kalman imputation as it operates on NumPy arrays, not PyTensor tensors
Imputation successful with kalman method.
Starting PyMC model definition
Starting MCMC sampling
ERROR (pytensor.graph.rewriting.basic): Rewrite failure due to: constant_folding
ERROR (pytensor.graph.rewriting.basic): node: Assert{msg=PyTensor Assert failed!}(0.0, False)
ERROR (pytensor.graph.rewriting.basic): TRACEBACK:
ERROR (pytensor.graph.rewriting.basic): Traceback (most recent call last):
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 1919, in process_node
replacements = node_rewriter.transform(fgraph, node)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 1081, in transform
return self.fn(fgraph, node)
^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/tensor/rewriting/basic.py”, line 1128, in constant_folding
required = thunk()
^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/link/c/op.py”, line 91, in rval
thunk()
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/link/c/basic.py”, line 1771, in call
raise exc_value.with_traceback(exc_trace)
AssertionError: PyTensor Assert failed!

Exception occurred during PyMC model definition or sampling:
Error occurred at line 107 in fit_tvp_pvar
Traceback (most recent call last):
File “/var/folders/p_/4shh5wgx7vn_sp4_77tp6c7m0000gn/T/ipykernel_11911/1276847003.py”, line 107, in fit_tvp_pvar
trace = pm.sample(draws=num_iterations, tune=tune, chains=cores)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pymc/sampling/mcmc.py”, line 684, in sample
step = assign_step_methods(model, step, methods=pm.STEP_METHODS, step_kwargs=kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pymc/sampling/mcmc.py”, line 212, in assign_step_methods
model_logp = model.logp()
^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pymc/model/core.py”, line 725, in logp
rv_logps = transformed_conditional_logp(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pymc/logprob/basic.py”, line 611, in transformed_conditional_logp
temp_logp_terms = conditional_logp(
^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pymc/logprob/basic.py”, line 479, in conditional_logp
fgraph, rv_values, _ = construct_ir_fgraph(rv_values, ir_rewriter=ir_rewriter)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pymc/logprob/rewriting.py”, line 475, in construct_ir_fgraph
ir_rewriter.rewrite(fgraph)
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 121, in rewrite
return self.apply(fgraph, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 291, in apply
sub_prof = rewriter.apply(fgraph)
^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 291, in apply
sub_prof = rewriter.apply(fgraph)
^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 2457, in apply
sub_prof = grewrite.apply(fgraph)
^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 2037, in apply
nb += self.process_node(fgraph, node)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 1922, in process_node
self.failure_callback(
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 1777, in warn_inplace
return cls.warn(exc, nav, repl_pairs, node_rewriter, node)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 1765, in warn
raise exc
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 1919, in process_node
replacements = node_rewriter.transform(fgraph, node)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 1081, in transform
return self.fn(fgraph, node)
^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/tensor/rewriting/basic.py”, line 1128, in constant_folding
required = thunk()
^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/link/c/op.py”, line 91, in rval
thunk()
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/link/c/basic.py”, line 1771, in call
raise exc_value.with_traceback(exc_trace)
AssertionError: PyTensor Assert failed!

Then I tried using PyTensor everywhere:

def fit_tvp_pvar(self, num_iterations=10000, burn=5000, tune=5000, cores=1, delay=1):
    print("Starting fit_tvp_pvar method")

    imputation_methods = ['kalman', 'iterative', 'knn']
    imputed = False

    for method in imputation_methods:
        try:
            print(f"Attempting {method} imputation")
            # Imputation process ...

            if method == 'kalman':
                data = self.kalman_imputation(self.data[self.vars].values.astype('float32'))
            elif method == 'iterative':
                imputer = IterativeImputer(max_iter=100, random_state=self.seed)
                data = imputer.fit_transform(self.data[self.vars].values.astype('float32'))
            elif method == 'knn':
                imputer = KNNImputer()
                data = imputer.fit_transform(self.data[self.vars].values.astype('float32'))

            data_tensor = pt.as_tensor_variable(data.astype('float32'))

            # Checking for inf or nan values ...
            if not pt.any(pt.isnan(data_tensor)).eval():
                imputed = True
                self.n_vars = data_tensor.shape[1].eval()
                print(f"Imputation successful with {method} method.")
                break
            else:
                print(f"Inf or nan values found after {method} imputation")
                # Handle inf or nan values ...

            time.sleep(delay)

        except Exception as e:
            print(f"{method} imputation failed: {e}")
            print(f"Error occurred at line {sys.exc_info()[2].tb_lineno} in fit_tvp_pvar")
            print(f"Error details: {traceback.format_exc()}")
            print("Trying the next method...")

    if not imputed:
        raise ValueError("All imputation methods failed.")

    try:
        print(f"Starting PyMC model definition")
        with pm.Model() as tvp_pvar_model:
            # Priors
            sd_dist = pm.HalfNormal('sd_dist', sigma=1.0)
            log_sd_vals = pm.Normal("log_sd_vals", mu=0, sigma=1, shape=(self.n_vars,))
            exp_log_sd_vals = pt.exp(log_sd_vals)
            sd_vals = pm.Deterministic("sd_vals", exp_log_sd_vals)
            alphas = pm.Normal("alphas", mu=0, sigma=1, shape=(self.n_obs, self.n_vars))

            # State transitions using loop
            state_vars = []
            for t in range(self.n_obs):
                prev_state = state_vars[t - 1] if t > 0 else pt.zeros((1, self.p * self.n_vars))
                shocks = pm.Normal(f"shocks_t{t}", mu=0.0, sigma=sd_vals, shape=(self.n_vars,))
                new_state = pt.zeros_like(prev_state)

                for lag in range(self.p):
                    is_not_finite = pt.or_(pt.isinf(alphas[t - lag - 1]), pt.isnan(shocks[t - lag]))
                    new_state += pt.dot(pt.switch(~is_not_finite, alphas[t - lag - 1], 0.0), shocks[t - lag])

                state_vars.append(new_state)

                # Likelihood
                etas = pm.Normal(f"etas_t{t}", 
                                 mu=pt.dot(state_vars[t], pt.repeat(pt.arange(self.p)[::-1], self.n_obs, axis=0)),
                                 sigma=sd_vals, 
                                 observed=data_tensor[t])

            print(f"Starting MCMC sampling")
            # MCMC sampling
            trace = pm.sample(draws=num_iterations, tune=tune, chains=cores)
            self.trace = trace

    except Exception:
        print("Exception occurred during PyMC model definition or sampling:")
        exc_type, exc_value, exc_traceback = sys.exc_info()
        print(f"Error occurred at line {exc_traceback.tb_lineno} in fit_tvp_pvar")
        print(''.join(traceback.format_exception(exc_type, exc_value, exc_traceback)))


def kalman_imputation(self, data):
    """
    Impute missing values using the Kalman Filter with PyTensor.
    """
    print("Starting Kalman imputation with PyTensor")
    
    try:
        # Convert data to PyTensor tensor
        data_tensor = pt.as_tensor_variable(data.astype('float32'))
        
        # Initialize Kalman Filter parameters (example, actual implementation may vary)
        initial_state_mean = pt.mean(data_tensor, axis=0)
        initial_state_mean = pt.where(pt.isnan(initial_state_mean), 0, initial_state_mean)

        # Example KalmanFilter class from PyTensor (hypothetical)
        kf = pt.KalmanFilter(initial_state_mean=initial_state_mean, n_dim_obs=data_tensor.shape[1])
        
        # Perform Kalman smoothing
        imputed_data, _ = kf.em(data_tensor, n_iter=10).smooth(data_tensor)
        
        # Convert back to numpy array if necessary
        filled_data = imputed_data.eval()  # Assuming PyTensor tensor to numpy array conversion
        
        return filled_data

    except Exception as e:
        print(f"Error in kalman_imputation: {e}")
        raise

This gives:

Starting fit_tvp_pvar method
Attempting kalman imputation
Starting Kalman imputation with PyTensor
Error in kalman_imputation: module ‘pytensor.tensor’ has no attribute ‘KalmanFilter’
kalman imputation failed: module ‘pytensor.tensor’ has no attribute ‘KalmanFilter’
Error occurred at line 44 in fit_tvp_pvar
Error details: Traceback (most recent call last):
File “/var/folders/p_/4shh5wgx7vn_sp4_77tp6c7m0000gn/T/ipykernel_12988/1786618341.py”, line 44, in fit_tvp_pvar
data = self.kalman_imputation(self.data[self.vars].values.astype(‘float32’))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/var/folders/p_/4shh5wgx7vn_sp4_77tp6c7m0000gn/T/ipykernel_12988/1786618341.py”, line 131, in kalman_imputation
kf = pt.KalmanFilter(initial_state_mean=initial_state_mean, n_dim_obs=data_tensor.shape[1])
^^^^^^^^^^^^^^^
AttributeError: module ‘pytensor.tensor’ has no attribute ‘KalmanFilter’

Trying the next method…
Attempting iterative imputation
Imputation successful with iterative method.
Starting PyMC model definition
Starting MCMC sampling
ERROR (pytensor.graph.rewriting.basic): Rewrite failure due to: constant_folding
ERROR (pytensor.graph.rewriting.basic): node: Assert{msg=PyTensor Assert failed!}(0.0, False)
ERROR (pytensor.graph.rewriting.basic): TRACEBACK:
ERROR (pytensor.graph.rewriting.basic): Traceback (most recent call last):
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 1919, in process_node
replacements = node_rewriter.transform(fgraph, node)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 1081, in transform
return self.fn(fgraph, node)
^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/tensor/rewriting/basic.py”, line 1128, in constant_folding
required = thunk()
^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/link/c/op.py”, line 91, in rval
thunk()
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/link/c/basic.py”, line 1771, in call
raise exc_value.with_traceback(exc_trace)
AssertionError: PyTensor Assert failed!

Exception occurred during PyMC model definition or sampling:
Error occurred at line 106 in fit_tvp_pvar
Traceback (most recent call last):
File “/var/folders/p_/4shh5wgx7vn_sp4_77tp6c7m0000gn/T/ipykernel_12988/1786618341.py”, line 106, in fit_tvp_pvar
trace = pm.sample(draws=num_iterations, tune=tune, chains=cores)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pymc/sampling/mcmc.py”, line 684, in sample
step = assign_step_methods(model, step, methods=pm.STEP_METHODS, step_kwargs=kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pymc/sampling/mcmc.py”, line 212, in assign_step_methods
model_logp = model.logp()
^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pymc/model/core.py”, line 725, in logp
rv_logps = transformed_conditional_logp(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pymc/logprob/basic.py”, line 611, in transformed_conditional_logp
temp_logp_terms = conditional_logp(
^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pymc/logprob/basic.py”, line 479, in conditional_logp
fgraph, rv_values, _ = construct_ir_fgraph(rv_values, ir_rewriter=ir_rewriter)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pymc/logprob/rewriting.py”, line 475, in construct_ir_fgraph
ir_rewriter.rewrite(fgraph)
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 121, in rewrite
return self.apply(fgraph, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 291, in apply
sub_prof = rewriter.apply(fgraph)
^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 291, in apply
sub_prof = rewriter.apply(fgraph)
^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 2457, in apply
sub_prof = grewrite.apply(fgraph)
^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 2037, in apply
nb += self.process_node(fgraph, node)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 1922, in process_node
self.failure_callback(
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 1777, in warn_inplace
return cls.warn(exc, nav, repl_pairs, node_rewriter, node)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 1765, in warn
raise exc
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 1919, in process_node
replacements = node_rewriter.transform(fgraph, node)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/rewriting/basic.py”, line 1081, in transform
return self.fn(fgraph, node)
^^^^^^^^^^^^^^^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/tensor/rewriting/basic.py”, line 1128, in constant_folding
required = thunk()
^^^^^^^
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/link/c/op.py”, line 91, in rval
thunk()
File “/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/link/c/basic.py”, line 1771, in call
raise exc_value.with_traceback(exc_trace)
AssertionError: PyTensor Assert failed!

Questions:

  1. Kalman Filter with pytensor:
  • Does pytensor.tensor offer a built-in Kalman Filter implementation?
  • If not, what alternative libraries or approaches could I use for Kalman Filter imputation in this context?
  2. PyMC Model Definition/Sampling Error:
  • The pytensor.graph.rewriting.basic error message lacks clarity. Could anyone offer insights or suggestions for troubleshooting this error during PyMC model definition or sampling? (A minimal way to reproduce it outside pm.sample is sketched just below.)
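In case it helps with debugging, the failure can presumably be reproduced without running MCMC by evaluating the model’s log-probability once (a sketch using standard PyMC model methods; tvp_pvar_model is the model object built in the code above):

# Sketch: trigger the same graph rewrites (and, presumably, the same Assert)
# by evaluating the log-probability at the initial point instead of sampling.
print(tvp_pvar_model.point_logps())             # per-variable logp at the initial point
logp_fn = tvp_pvar_model.compile_logp()         # compiled total model logp
print(logp_fn(tvp_pvar_model.initial_point()))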

Additional Information:

  • Libraries used:
    • pymc v5
    • pytensor 2.23.0
      (Please advise if there’s a specific version recommended for Kalman Filter functionality)

I would greatly appreciate any guidance or suggestions you can offer to help me resolve these errors and successfully execute the fit_tvp_pvar function.

Thank you in advance for your time and assistance!

Sincerely,

Dimitri

Hi Dimitri,

There are Kalman Filters implemented in pymc-experimental; you can take a look at the docs here. If you just want to fit a TVP-VAR, you can do it with the VARMAX model. Docs here, example notebook here. Just a heads-up that there are some known bugs in the main branch I’m working on squashing, but if you’re willing to be a bit of a guinea pig, I’d be happy to try to help.

That said, there are many ways to write down a VAR in PyMC. The TVP part just comes from putting time-varying priors on the coefficient matrix, e.g. using a GaussianRandomWalk (or MvGaussianRandomWalk). Once you do that, it’s straightforward to implement without going through the hassle of a full statespace setup: for example, as a 2D convolution, or as a simple regression over lagged data (see the sketch below).
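As a concrete, untested sketch of the lagged-regression route: the helper names lag_matrix and tvp_var_lagged, and all the prior scales, are placeholders of mine rather than a recommended specification. The key piece is the GaussianRandomWalk prior over the per-period coefficient matrices, whose time dimension runs along the last axis.

import numpy as np
import pymc as pm

def lag_matrix(y, p):
    """Stack p lags of y (shape (T, k)) into a (T - p, k * p) regressor matrix."""
    T, k = y.shape
    return np.concatenate([y[p - lag: T - lag] for lag in range(1, p + 1)], axis=1)

def tvp_var_lagged(y, p=1, walk_sd=0.05):
    T, k = y.shape
    X = lag_matrix(y, p)    # regressors: lagged values, shape (T - p, k * p)
    Y = y[p:]               # targets, shape (T - p, k)

    with pm.Model() as model:
        # One (k, k * p) coefficient matrix per time step, evolving as a
        # Gaussian random walk; the random-walk (time) dimension is the last axis.
        coefs = pm.GaussianRandomWalk(
            "coefs",
            sigma=walk_sd,
            init_dist=pm.Normal.dist(0.0, 0.5),
            shape=(k, k * p, T - p),
        )
        intercept = pm.Normal("intercept", 0.0, 1.0, shape=k)
        sigma = pm.HalfNormal("sigma", 1.0, shape=k)

        # mu[t, i] = intercept[i] + sum_j coefs[i, j, t] * X[t, j]
        mu = intercept + (coefs.transpose(2, 0, 1) * X[:, None, :]).sum(axis=-1)
        pm.Normal("obs", mu=mu, sigma=sigma, observed=Y)

    return model

Sampling is then just pm.sample() inside the returned model’s context; the 2D-convolution version would replace the explicit lag matrix with a convolution over the time axis, but the regression form is usually easier to start from.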

It’s not really clear to me what your objective is with the Kalman Filter code you posted. Do you have missing data you want to interpolate, or is that just an algorithm you’re familiar with for fitting these kinds of models? In principle you don’t need it; PyMC can derive the log likelihood of a sequence of observations without using the KF recursions (see the missing-data example below).
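For instance, PyMC will automatically impute observed values that are masked as missing: latent variables are created for the masked entries and sampled jointly with the other parameters. A toy illustration (not your model):

import numpy as np
import pymc as pm

y = np.ma.masked_invalid(np.array([1.0, 2.0, np.nan, 4.0, np.nan]))  # toy series with gaps

with pm.Model():
    mu = pm.Normal("mu", 0.0, 10.0)
    sigma = pm.HalfNormal("sigma", 1.0)
    # Masked entries become latent variables and are sampled along with mu and sigma.
    pm.Normal("y", mu=mu, sigma=sigma, observed=y)
    idata = pm.sample()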


If you don’t have exogenous data, it’s not clear to me where the panel part comes in. But that’s a semantic quibble. Have you looked at this article on TVP-VAR in statsmodels? The pymc-experimental statespace module is written to follow the statsmodels API very closely, so it would be quite easy to use their TVP-VAR class to implement the same thing in PyMC. You can follow this tutorial on making custom statespace models for guidance.

The statespace module handles data interpolation automatically, so that’s no problem. I will warn you that high-dimensional VAR models are going to struggle to sample, because the probability of drawing stationary parameters goes to zero quite quickly as the coefficient matrix grows. There are strategies to handle this, but none implemented and ready to use.


Hi Jesse,
Thank you for the valuable suggestions on my previous posts! I’m writing to delve deeper into the setup of a state-space Panel VAR with a well-defined likelihood function, where time variation enters as a latent unit root.

Model Specifications:

  • Endogenous Variables: All variables in my panel are endogenous (no exogenous variables).
  • Latent Unit Root: The disturbance term in my model exhibits a latent unit root.
  • Stationary Data: I’ve confirmed stationarity of all variables in the panel using appropriate unit root tests.
  • High Dimensions: I have several endogenous variables in the model (about 30)

Challenges:

  1. Missing Data Handling: My data spans from the post-war period to the present day and has gaps. I’ve addressed some missing values by removing rows for countries that didn’t exist during certain periods (e.g., the Baltic countries). However, I’m hesitant to use methods like forward/backward filling or carrying the most recent observation for non-existent countries; I find that unrealistic even for the rest of the sample.
  2. Non-Converging Code: I’ve attempted to implement the model using various approaches:
  • VARMAX in statsmodels (failed to converge)
  • Several different code implementations (no successful results)

Alternative Approaches:

Following your suggestion, I’m now exploring several approaches to implementing the model, including:

  • 2D Convolution: Treating the lag structure as a 2D convolution.
  • Simple Lagged Regression: Regressing the dependent variable on lagged independent variables.
  • Custom State-Space Model: Building a custom state-space model to account for the specific characteristics of my data.

I’ve written three implementations along these lines, but none of them has converged after running for three days; they seem to be stuck without making progress.

If this is taking too much of your time here, I’m happy to share the last three code versions privately with you, or with anyone willing to help diagnose the convergence issues.

Ultimate Goal:

My ultimate goal is to obtain impulse response functions and potentially estimate a FEVD using either Bayesian or frequentist methods (flexible on the method). For the missing data, I’ve experimented with Kalman filtering, but I’m open to alternative techniques.

I am aware that PyMC can treat missing data implicitly through priors; however, I have not been successful with that approach either.

Request for Assistance:

  • Missing Data Techniques: Suitable methods for handling missing data in a Panel VAR with a latent unit root in disturbances, considering the limitations mentioned above (no filling forward/backward or using most recent observations for non-existent countries).
  • Debugging Non-Converging Code: Potential reasons why my code implementations might not be converging and possible troubleshooting steps.

Multicollinearity Note:

While I previously mentioned multicollinearity between the economic variables and indicators, some of these specific indicator measurements may not be a major concern because of how they relate over time (when one appears, the other is missing).

Specific Questions:

  • Techniques for managing multicollinearity in a TVP-PVAR AR(1) framework when dropping/combining variables is undesirable.
  • Robust methods for identifying the most relevant indicators while considering multicollinearity and the need to retain all variables.
  • Alternative methods for interpreting separate regression results when collinearity is unavoidable.

Again, my ultimate goal is to obtain impulse response functions (IRFs) and a FEVD.

Thanks again for your time and assistance!

Dimitri

Can you write down what model you want as a system of equations?

Dear everyone,

I have closely followed the instructions from @jessegrabowski (thanks again!) and have written some code.

To sum up, I’m working on a Bayesian analysis of time-varying relationships between macroeconomic variables across multiple countries using a TVP-PVAR model implemented in PyMC. I’m encountering convergence issues during MCMC sampling and would appreciate some guidance from the community.

Model Description:

My model employs a state-space framework to represent the time-varying nature of the parameters. This framework involves:

  • Observation Equation: Relates the observed data (denoted as y_t for time t) to a time-varying state vector (alpha_t) and an error term (epsilon_t).
  • State Evolution Equation: Describes how the state vector (alpha_t) evolves over time, often depending on the previous state (alpha_(t-1)) and potentially a random component (eta_t).

Key Features:

  • The model incorporates time-varying parameters within the state vector (alpha_t) to capture dynamic relationships between economic variables.
  • Stochastic volatility is included, meaning the variance of the error term (epsilon_t) can fluctuate over time.

General Model Equation:

To give a general idea of the model structure without going into the specific variables and functional forms, here is the simplified observation equation:

y_t = Z_t * alpha_t + epsilon_t

  • y_t represents the observed vector of economic indicators at time t.
  • Z_t denotes a design matrix that incorporates relevant information.
  • alpha_t represents the time-varying state vector capturing the dynamic relationships.
  • epsilon_t represents the error term at time t.

Code Snippet (Relevant Part):
Here’s a relevant code snippet that demonstrates the general structure of how I define the time-varying state vector and its priors:

import pymc as pm
import pytensor.tensor as pt
import numpy as np

# ... other relevant libraries

def process_country(country, filtered_data, vars):
    # ... (existing code for data preparation)

    with pm.Model() as model:
        # Priors
        alpha_0 = pm.MvNormal('alpha_0', mu=pt.zeros(N), cov=pt.eye(N) * 10)
        beta_0 = pm.MvNormal('beta_0', mu=pt.zeros(N), cov=pt.eye(N) * 10)
        sigma_alpha = pm.HalfCauchy('sigma_alpha', beta=1)
        sigma_beta = pm.HalfCauchy('sigma_beta', beta=1)

        # State equations
        alpha = pm.MvNormal('alpha', mu=alpha_0, cov=sigma_alpha**2 * pt.eye(N), shape=(T_country, N))
        beta = pm.MvNormal('beta', mu=beta_0, cov=sigma_beta**2 * pt.eye(N), shape=(T_country, N))

        # Measurement equation
        sigma_y = pm.HalfCauchy('sigma_y', beta=1)
        Y_obs = pm.MvNormal('Y_obs', mu=alpha + beta, cov=sigma_y**2 * pt.eye(N), observed=Y_country)

        # ... (existing code for sampling and diagnostics)

    return trace, model

Convergence Issues:

I’m experiencing poor convergence during MCMC sampling. Here are the specific issues:

  • Gelman-Rubin statistic (r_hat) for some variables exceeds 1.1, indicating inadequate convergence.
  • The sampler encounters divergences during the tuning phase.

The exact error is:

There were xxx divergences after tuning. Increase target_accept or reparameterize.
The rhat statistic is larger than 1.01 for some parameters. This indicates problems during sampling. See [1903.08008] Rank-normalization, folding, and localization: An improved $\widehat{R}$ for assessing convergence of MCMC for details
The effective sample size per chain is smaller than 100 for some parameters. A higher number is needed for reliable rhat and ess computation. See [1903.08008] Rank-normalization, folding, and localization: An improved $\widehat{R}$ for assessing convergence of MCMC for details

I’ve attempted increasing the target_accept parameter and running longer chains, but the problems persist.

Finally, I encountered this error:

Error computing IRF: LinAlgError. Details: SVD did not converge in Linear Least Squares
Unable to compute IRF

This happens for all countries in the sample. I am interested in IRFs for the panel structure, not country by country.

Questions:

  • How can I improve convergence in TVP-PVAR models using PyMC?
  • Are there recommended reparameterization techniques or prior distributions for this type of model? (For instance, would a non-centered version of the alpha/beta block, like the sketch below, be the right direction?)
  • What alternative approaches can address convergence issues in this scenario?
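For concreteness, this is the kind of non-centered rewrite of the alpha/beta priors I have in mind (an untested sketch of the block from the snippet above; since the covariances are diagonal, the MvNormals reduce to independent Normals whose scale can be factored out):

# Non-centered sketch of the alpha/beta state priors (untested).
alpha_raw = pm.Normal('alpha_raw', mu=0.0, sigma=1.0, shape=(T_country, N))
alpha = pm.Deterministic('alpha', alpha_0 + sigma_alpha * alpha_raw)

beta_raw = pm.Normal('beta_raw', mu=0.0, sigma=1.0, shape=(T_country, N))
beta = pm.Deterministic('beta', beta_0 + sigma_beta * beta_raw)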

Additional Information:

  • The current sampling process for one country takes approximately 23 hours, significantly longer than the initial 2 hours. I have 60 countries in the sample, making computational efficiency crucial.
  • I’m using joblib.Parallel with a 1-second delay between iterations for better memory management.

Thank you in advance for your time and insights! Any suggestions or guidance on improving convergence and computational efficiency would be greatly appreciated, and I would also be grateful for any help refining my code.

Sincerely,

Dimitri

Libraries Used:

  • PyMC (pymc) for probabilistic programming and MCMC sampling.
  • NumPy (numpy) for numerical computations.
  • Statsmodels (statsmodels) for time series analysis (used for stationarity testing and VAR model fitting in this example).