Help! "Unsupported dtype for TensorType: object" (PyMC)

Hi everyone,

I’m encountering an error during model fitting in my TVP-PVAR model using PyMC v5. The error message is:

Unsupported dtype for TensorType: object

I’ve tried several things to troubleshoot, but I’m still stuck. Here’s some background:

  • I have a custom class TVPPVARFitter that handles data loading, imputation, and model fitting.
  • I’ve tried various imputation methods (kalman, iterative, knn) to address missing values, but the error persists.
  • I’ve explicitly cast data to float32 before feeding it to the model (in the fit_tvp_pvar method).
  • I’ve verified that all variables in the data used for fitting are indeed of type float32 (shown in the data loading section).

Code Snippets:

  1. Data Loading (relevant part):
# Import libraries
import pandas as pd
import numpy as np
import pymc as pm  # PyMC 5.15.1
import arviz as az

# Load data from CSV file
df = pd.read_csv('data.csv')

# Filter data to include only records from 1990 onwards
df = df[df['year'] >= 1990]

# Create vars_to_float list with only numerical columns
vars_to_float = [col for col in df.columns if df[col].dtype in ['float64', 'int64']]

# Filter vars_of_interest to include only those in vars_to_float
vars_to_float = list(set(vars_to_float) & set(vars_of_interest))

# Ensure all variables in vars_to_float are treated as float32
df[vars_to_float] = df[vars_to_float].astype('float32')

# Print data types to verify they are float32
print(df[vars_to_float].dtypes)  # This should show all data types as 'float32'

df_float = df[vars_to_float].copy()
model = TVPPVARFitter(df_float, vars_to_float, p=1)  # Assuming p=1 for lag order
  2. fit_tvp_pvar method (relevant part):
def fit_tvp_pvar(self, num_iterations=10000, burn=5000, tune=5000, cores=1, delay=1):
  # ... (other parts of the method)

  # Imputation logic
  from sklearn.experimental import enable_iterative_imputer  # Import for iterative imputer
  from sklearn.impute import IterativeImputer, KNNImputer  # Imports for imputation methods

  imputation_methods = ['kalman', 'iterative', 'knn']
  imputed = False
  for method in imputation_methods:
    # ... (imputation steps using the chosen method)
    if not np.any(np.isnan(data)) and not np.any(np.isinf(data)):
      imputed = True
      self.n_vars = data.shape[1]
      break

  # Convert to float32 again (for safety)
  data = np.array(data, dtype=np.float32)

  # ... (rest of the model fitting logic)

  # Import PyMC and ArviZ
  import pymc as pm  # PyMC 5 for model definition
  import arviz as az  # ArviZ for diagnostics and visualization

  # ... (model definition and MCMC sampling using PyMC and ArviZ)

Questions:

  1. Has anyone encountered a similar “object” type error with PyMC?
  2. Are there any additional debugging strategies I can use to pinpoint the source of these non-numeric values (besides checking data types)?
  3. Could there be other reasons for this error besides missing values?

  4. Could anyone help me modify the code?
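For question 2, one debugging strategy is to ask pandas which columns are not numeric and which raw entries fail to parse. The sketch below is only an illustration: the frame and the column names "gdp" and "cpi" are made up and stand in for the real CSV.

```python
import numpy as np
import pandas as pd

# Hypothetical frame standing in for the CSV; "gdp" hides a non-numeric entry.
df = pd.DataFrame({
    "year": [1990, 1991, 1992],
    "gdp": ["1.5", "n/a", "2.5"],
    "cpi": [2.0, np.nan, 2.2],
})

# Columns that pandas does not store with a numeric dtype.
non_numeric_cols = [
    col for col in df.columns
    if not pd.api.types.is_numeric_dtype(df[col])
]

# For each such column, collect the raw values that cannot be parsed as numbers.
bad_values = {}
for col in non_numeric_cols:
    parsed = pd.to_numeric(df[col], errors="coerce")
    # Entries that were present originally but became NaN after parsing.
    mask = parsed.isna() & df[col].notna()
    bad_values[col] = df.loc[mask, col].tolist()

print(non_numeric_cols)
print(bad_values)
```

If this reports nothing for the real data, the object dtype is probably being created later, inside the model code, rather than in the input data.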

I have not used PyTensor directly. I assumed it was unnecessary, since PyMC v5 handles tensors internally.

Additional Information:

  • PyMC version: 5.15.1
  • Libraries used: numpy, pandas, sklearn, pymc, arviz, matplotlib, pykalman

Note: I haven’t included the entire fit_tvp_pvar method or data pre-processing steps (like filtering by year) for brevity. However, the provided snippets highlight the key areas related to data loading, imputation, and type casting.

Thank you wholeheartedly in advance for any help you can provide.

Best regards

Dimitri

Can you provide the entire error message?

Thanks for your help with the previous error! I’ve made some debugging changes and encountered a new error:

def fit_tvp_pvar(self, num_iterations=10000, burn=5000, tune=5000, cores=1, delay=1):
  imputation_methods = ['kalman', 'iterative', 'knn']

  imputed = False
  for method in imputation_methods:
    try:
      if method == 'kalman':
        # Kalman Filter Imputation
        data = self.kalman_imputation(self.data[self.vars].values.astype('float32'))
        print("Imputation successful with Kalman filter method.")

      elif method == 'iterative':
        # Iterative Imputer
        imputer = IterativeImputer(max_iter=100, random_state=self.seed)
        with warnings.catch_warnings():
          warnings.filterwarnings("ignore", category=ConvergenceWarning)
          data = imputer.fit_transform(self.data[self.vars].values.astype('float32'))
        print("Imputation successful with iterative method.")

      elif method == 'knn':
        # KNN Imputer
        imputer = KNNImputer()
        data = imputer.fit_transform(self.data[self.vars].values.astype('float32'))
        print("Imputation successful with KNN method.")

      if not np.any(np.isnan(data)) and not np.any(np.isinf(data)):
        imputed = True
        self.n_vars = data.shape[1]
        break

      time.sleep(delay) # Add 1 second delay between iterations

    except Exception as e:
      print(f"{method} imputation failed: {e}")
      print("Trying the next method...")

  if not imputed:
    raise ValueError("All imputation methods failed.")

  # Convert to a clean numpy array with the correct dtype
  data = np.array(data, dtype=np.float32)

  try:
    with pm.Model() as tvp_pvar_model:
      # Priors
      sd_dist = pm.HalfNormal.dist(sigma=1.0)
      log_sd_vals = pm.Normal("log_sd_vals", mu=0, sigma=1, shape=(self.n_vars,))
      exp_log_sd_vals = pm.math.exp(log_sd_vals)
      sd_vals = pm.Deterministic("sd_vals", exp_log_sd_vals)
      alphas = pm.Normal("alphas", mu=0, sigma=1, shape=(self.n_obs, self.n_vars))

      # Initial state for hidden variables (replace with appropriate values based on your data)
      initial_state = np.zeros((1, self.p * self.n_vars))

      # State transitions using loop
      state_vars = []
      for t in range(self.n_obs):
        prev_state = state_vars[t - 1] if t > 0 else initial_state
        shocks = pm.Normal("shocks_t{}".format(t), mu=0.0, sigma=sd_vals, shape=(self.n_vars,))
        new_state = np.zeros_like(prev_state)

        # Loop through previous states and shocks to accumulate state updates
        for lag in range(self.p):
          new_state += pm.math.dot(alphas[t - lag - 1], shocks[t - lag])

        state_vars.append(new_state)

      # Likelihood
      etas = pm.Normal("etas", mu=pm.math.dot(state_vars[0], pm.math.repmat(np.arange(self.p)[::-1], self.n_obs, 1)),
               sigma=sd_vals, observed=data)

    # MCMC sampling
    trace = pm.sample(draws=num_iterations, tune=tune, cores=cores)
    self.trace = trace
  except RecursionError as e:
    print(f"RecursionError encountered: {e}. Increasing recursion limit.")
    sys.setrecursionlimit(10000)
    self.fit_tvp_pvar(num_iterations=num_iterations, burn=burn, tune=tune, cores=cores, delay=delay)
  except Exception as e:
    import traceback
    print(traceback.format_exc())
    print(f"Error during model fitting: {e}")

I got:

Traceback (most recent call last):
File "/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/tensor/type.py", line 293, in dtype_specs
return self.dtype_specs_map[self.dtype]
~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
KeyError: 'object'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/var/folders/p_/4shh5wgx7vn_sp4_77tp6c7m0000gn/T/ipykernel_5819/1811677506.py", line 95, in fit_tvp_pvar
new_state += pm.math.dot(alphas[t - lag - 1], shocks[t - lag])
File "/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/tensor/variable.py", line 201, in __radd__
return pt.math.add(other, self)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/graph/op.py", line 292, in __call__
node = self.make_node(*inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/tensor/elemwise.py", line 481, in make_node
inputs = [as_tensor_variable(i) for i in inputs]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/tensor/elemwise.py", line 481, in <listcomp>
inputs = [as_tensor_variable(i) for i in inputs]
^^^^^^^^^^^^^^^^^^^^^
File "/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/tensor/__init__.py", line 50, in as_tensor_variable
return as_tensor_variable(x, name, ndim, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dimitri/anaconda3/lib/python3.11/functools.py", line 909, in wrapper
return dispatch(args[0].__class__)(*args, **kw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/tensor/basic.py", line 185, in as_tensor_numbers
return constant(x, name=name, ndim=ndim, dtype=dtype)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/tensor/basic.py", line 238, in constant
ttype = TensorType(dtype=x_.dtype, shape=x_.shape)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/tensor/type.py", line 122, in __init__
self.dtype_specs() # error checking is done there
^^^^^^^^^^^^^^^^^^
File "/Users/dimitri/anaconda3/lib/python3.11/site-packages/pytensor/tensor/type.py", line 295, in dtype_specs
raise TypeError(
TypeError: Unsupported dtype for TensorType: object

Error during model fitting: Unsupported dtype for TensorType: object

Then, looking deeper, I updated the fit_tvp_pvar method and moved the debugging steps to the beginning:

def fit_tvp_pvar(self, num_iterations=10000, burn=5000, tune=5000, cores=1, delay=1):
  imputation_methods = ['kalman', 'iterative', 'knn']

  imputed = False
  for method in imputation_methods:
    try:
      if method == 'kalman':
        # Kalman Filter Imputation
        data = self.kalman_imputation(self.data[self.vars].values.astype('float32'))
        print("Imputation successful with Kalman filter method.")

      elif method == 'iterative':
        # Iterative Imputer
        imputer = IterativeImputer(max_iter=100, random_state=self.seed)
        with warnings.catch_warnings():
          warnings.filterwarnings("ignore", category=ConvergenceWarning)
          data = imputer.fit_transform(self.data[self.vars].values.astype('float32'))
        print("Imputation successful with iterative method.")

      elif method == 'knn':
        # KNN Imputer
        imputer = KNNImputer()
        data = imputer.fit_transform(self.data[self.vars].values.astype('float32'))
        print("Imputation successful with KNN method.")

      if not np.any(np.isnan(data)) and not np.any(np.isinf(data)):
        imputed = True
        self.n_vars = data.shape[1]
        break

      time.sleep(delay) # Add 1 second delay between iterations

    except Exception as e:
      print(f"{method} imputation failed: {e}")
      print("Trying the next method...")

  if not imputed:
    raise ValueError("All imputation methods failed.")

  # Convert to a clean numpy array with the correct dtype
  data = np.array(data, dtype=np.float32)

  # Check for non-numeric values in the data
  print(f"Non-numeric values in data: {np.any(~np.isfinite(data))}")

  try:
    with pm.Model() as tvp_pvar_model:
      # Print data types before model creation
      print(f"data dtype: {data.dtype}")

      # Priors
      sd_dist = pm.HalfNormal.dist(sigma=1.0)
      log_sd_vals = pm.Normal("log_sd_vals", mu=0, sigma=1, shape=(self.n_vars,))
      exp_log_sd_vals = pm.math.exp(log_sd_vals)
      sd_vals = pm.Deterministic("sd_vals", exp_log_sd_vals)
      alphas = pm.Normal("alphas", mu=0, sigma=1, shape=(self.n_obs, self.n_vars))

      # Initial state for hidden variables (replace with appropriate values based on your data)
      initial_state = np.zeros((1, self.p * self.n_vars))

      # State transitions using loop
      state_vars = []
      for t in range(self.n_obs):
        prev_state = state_vars[t - 1] if t > 0 else initial_state
        shocks = pm.Normal("shocks_t{}".format(t), mu=0.0, sigma=sd_vals, shape=(self.n_vars,))
        new_state = np.zeros_like(prev_state)

        # Loop through previous states and shocks to accumulate state updates
        for lag in range(self.p):
          # Check data types of alphas and shocks
          print(f"alphas dtype: {alphas.dtype}")
          print(f"shocks dtype: {shocks.dtype}")

          # Check for non-numeric values in alphas and shocks
          print(f"Non-numeric values in alphas: {np.any(~np.isfinite(alphas))}")
          print(f"Non-numeric values in shocks: {np.any(~np.isfinite(shocks))}")

          # Convert alphas and shocks to numerical arrays
          alphas = np.array(alphas, dtype=np.float32)
          shocks = np.array(shocks, dtype=np.float32)

          new_state += pm.math.dot(alphas[t - lag - 1], shocks[t - lag])

        state_vars.append(new_state)

      # Likelihood
      etas = pm.Normal("etas", mu=pm.math.dot(state_vars[0], pm.math.repmat(np.arange(self.p)[::-1], self.n_obs, 1)),
               sigma=sd_vals, observed=data)

    # MCMC sampling
    trace = pm.sample(draws=num_iterations, tune=tune, cores=cores)
    self.trace = trace
  except RecursionError as e:
    print(f"RecursionError encountered: {e}. Increasing recursion limit.")
    sys.setrecursionlimit(10000)
    self.fit_tvp_pvar(num_iterations=num_iterations, burn=burn, tune=tune, cores=cores, delay=delay)
  except Exception as e:
    import traceback
    print(traceback.format_exc())
    print(f"Error during model fitting: {e}")

I moved the debugging steps to the beginning of the try block, right after the with pm.Model() line. This way, the print statements run before any PyMC operations, so their output appears regardless of where the error occurs within the with block.

Additionally, I’ve added a new print statement to check the data type of the data variable before creating the PyMC model:

print(f"data dtype: {data.dtype}")

This produced:

kalman imputation failed: array must not contain infs or NaNs
Trying the next method...
Imputation successful with iterative method.
Non-numeric values in data: False
data dtype: float32
alphas dtype: float64
shocks dtype: float64
Traceback (most recent call last):
File "/var/folders/p_/4shh5wgx7vn_sp4_77tp6c7m0000gn/T/ipykernel_6927/1672946134.py", line 106, in fit_tvp_pvar
print(f"Non-numeric values in alphas: {np.any(~np.isfinite(alphas))}")
^^^^^^^^^^^^^^^^^^^
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Error during model fitting: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

To summarize where things stand:

Debugging Steps Taken:

  1. Moved Debugging Statements: I moved the printing of data types and checks for non-numeric values to the beginning of the try block, ensuring they run before any PyMC operations.
  2. Checked Data Type: I added a print statement to verify the data type after conversion (data.dtype).

Current Observations:

  • The imputation seems successful (no NaNs or infs).
  • Data is converted to float32.
  • alphas and shocks data types are float64.

Possible Causes and Solutions:

Based on the error message and observations, it seems the issue might be related to the data types of alphas and shocks. Although the data itself is float32, PyMC might be creating these variables with a different default dtype (float64 in this case). This mismatch could cause problems with the isfinite function.
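As a quick sanity check on this hypothesis, the sketch below (NumPy only, with made-up values) compares how np.isfinite treats float32, float64, and object-dtype input. It is not a diagnosis of the model itself, only a way to see which kind of input triggers this particular TypeError.

```python
import numpy as np

values = [1.0, 2.0, 3.0]

# Both floating-point widths are accepted by np.isfinite.
ok32 = bool(np.isfinite(np.array(values, dtype=np.float32)).all())
ok64 = bool(np.isfinite(np.array(values, dtype=np.float64)).all())

# An object-dtype array is rejected, even when every element is a plain float.
try:
    np.isfinite(np.array(values, dtype=object))
    object_error = None
except TypeError as exc:
    object_error = str(exc)

print(ok32, ok64)
print(object_error)
```

Since float64 input is accepted here, the float32/float64 difference alone seems unlikely to explain the failure. It may have more to do with alphas and shocks being symbolic model variables rather than NumPy arrays of numbers.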

Additional Notes:

  • The kalman imputation method seems to be failing. Consider checking its implementation or trying alternative imputation techniques if necessary.

I appreciate any insights or suggestions you might have to resolve this new error.

Thanks,

Dimitri

Well, the first thing is that it’s generally not a great idea to be mixing numpy methods and PyTensor tensors. For example, you can try using PyTensor’s isinf() rather than numpy’s because alphas is a PyTensor tensor and not a numpy array.

I wanted to express my sincere gratitude for the assistance I received on my previous post regarding the "Unsupported dtype for TensorType: object" (PyMC) error. The insights you provided were incredibly helpful!

Update on PyMC Model Definition/Sampling Error:

Following your suggestions, I was able to successfully troubleshoot the problem.

New Issue: Kalman Filter Implementation

While the PyMC model issue is resolved, I’ve encountered a new challenge related to Kalman Filter implementation. As some of you pointed out, pytensor doesn’t appear to have a built-in Kalman Filter class.

Here’s a link to a related discussion on the topic: Help Needed: Kalman Filter with PyMC and pytensor.tensor.KalmanFilter Error

I’m open to exploring alternative libraries or approaches for Kalman Filter imputation within my project. Any suggestions or recommendations on this front would be greatly appreciated.
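As one possible starting point that needs no extra library, here is a rough NumPy-only sketch of Kalman-filter imputation for a single series. It assumes a scalar local-level (random walk plus noise) model, which is a simplification and not the TVP-PVAR state equation. The function name local_level_impute and the noise variances q and r are hypothetical choices for illustration.

```python
import numpy as np

def local_level_impute(y, q=1e-2, r=1.0):
    """Fill NaNs in a 1-D series using a scalar local-level Kalman filter.

    q is the assumed state (random-walk) variance and r the assumed
    observation variance; both are illustrative defaults.
    """
    y = np.asarray(y, dtype=float)
    filled = y.copy()

    observed = np.flatnonzero(~np.isnan(y))
    if observed.size == 0:
        raise ValueError("series has no observed values")

    # Initialise the state at the first observed value.
    mean, var = y[observed[0]], 1.0

    for t in range(len(y)):
        # Predict step: the state follows a random walk, so variance grows by q.
        var = var + q
        if np.isnan(y[t]):
            # Missing observation: skip the update and use the prediction.
            filled[t] = mean
        else:
            # Update step with the scalar Kalman gain.
            gain = var / (var + r)
            mean = mean + gain * (y[t] - mean)
            var = (1.0 - gain) * var
    return filled

filled = local_level_impute([1.0, np.nan, 3.0])
print(filled)
```

The key point is that missing time steps skip the update step rather than passing NaNs into the linear algebra. The earlier "array must not contain infs or NaNs" message suggests that NaNs may have reached a routine that does not accept them. This sketch filters forward only; a smoother would also use later observations.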

Thanks again for your continued support!

Sincerely,
Dimitri