Traceplot error when modelling multivariate normal with LKJCholeskyCov

wjgregory92 · January 11, 2021, 10:27am

I am trying to model the problem below, which is very similar to the example posted here. Sampling runs fine, but the traceplot fails with the error: ValueError: cannot convert float NaN to integer. The only cause I can think of is that in the trace summary az.summary(trace, var_names=["~L"], hdi_prob=0.95, round_to=2) the r_hat for the first entry of the correlation matrix is NaN. Is there any way to get the traceplot to ignore this, or is something else going wrong here?

Many thanks

import warnings

import numpy as np
import pymc3 as pm
import arviz as az

#generate synthetic data set
n = 500
mu = np.array([0,0])
sigma1 = 2
sigma2 = 1.5
r = -0.7
Sigma = np.reshape([sigma1**2,r*sigma1*sigma2,r*sigma1*sigma2, sigma2**2],(2,2))
D = np.random.multivariate_normal(mu,Sigma,size=n)

#model with pymc3
with pm.Model() as model:
   
    L,R,sigma = pm.LKJCholeskyCov('L', n=2,
                                 eta=1., sd_dist=pm.HalfCauchy.dist(1),compute_corr=True)

    # raw strings keep the LaTeX backslashes from being read as escape sequences
    mu = pm.Normal(r'$\mu$',0,1.5,shape=2,testval=0)
    cov = pm.Deterministic(r'$\Sigma$', L.dot(L.T))
    
    likelihood = pm.MvNormal('obs', mu=mu, chol=L, observed=D)
    
    warnings.filterwarnings("ignore")
    trace = pm.sample(1000, chains=4, cores=2, progressbar=True, random_seed=1,
                      init="adapt_diag")  # , return_inferencedata=True)

az.plot_trace(
    trace,
    var_names=["~L"],
    compact=True,
);

linehammer · May 17, 2021, 8:44am

You can avoid this with a mask method. Note first that in Python, NaN is a float value that is not equal to itself:

float('nan') == float('nan')
False
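
As a minimal sketch using only the standard library, the comparison above and the error message from the question can both be reproduced without PyMC3 or ArviZ. The variable names are illustrative only.

```python
import math

nan = float("nan")

# NaN compares unequal to itself, so use math.isnan to detect it.
nan_equals_itself = nan == nan
nan_detected = math.isnan(nan)

# Converting NaN to a built-in int raises the ValueError from the question.
try:
    int(nan)
    message = None
except ValueError as err:
    message = str(err)

print(nan_equals_itself, nan_detected, message)
```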

The ValueError: cannot convert float NaN to integer is raised because pandas cannot store NaN in its default, NumPy-backed integer columns. pandas 0.24 introduced nullable integer data types, which allow integers to coexist with missing values. These are pandas integer types rather than NumPy integer types. So, use a nullable integer data type (e.g. Int64).

df['x'].astype('Int64')

NB: You have to go through a NumPy float first and then to the nullable integer type, for some reason.
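
A small sketch of the conversion described above, assuming a float Series that holds whole numbers plus one missing value (the data is made up for illustration):

```python
import numpy as np
import pandas as pd

# Float column with a missing value.
values = pd.Series([1.0, 2.0, np.nan])

# Casting to the NumPy-backed integer dtype fails because NaN has no integer form.
try:
    values.astype("int64")
    numpy_cast_error = None
except ValueError as err:
    numpy_cast_error = str(err)

# The nullable pandas dtype (capital "I") keeps the missing entry as <NA>.
nullable = values.astype("Int64")
print(nullable)
```

Note the capitalisation: "int64" is the NumPy dtype, while "Int64" is the pandas nullable extension dtype.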

OriolAbril · May 17, 2021, 10:34am

plot_trace neither calculates nor uses rhat, so there has to be something else going wrong. Which ArviZ version are you using? It could be an error in computing the KDE, precisely because the first entry is a constant and would have a Dirac delta as its pdf.
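
To illustrate why a constant entry is troublesome, here is a rough NumPy sketch. It uses the classic Gelman-Rubin formula and a Silverman-style bandwidth as stand-ins; these are assumptions for illustration and not ArviZ's actual rank-normalized r_hat or its KDE code. The helper classic_rhat is hypothetical.

```python
import numpy as np

def classic_rhat(chains):
    """Classic Gelman-Rubin statistic for an array shaped (chains, draws)."""
    n_draws = chains.shape[1]
    within = chains.var(axis=1, ddof=1).mean()
    between = n_draws * chains.mean(axis=1).var(ddof=1)
    var_hat = (n_draws - 1) / n_draws * within + between / n_draws
    # For a constant variable this is 0 / 0; silence the NumPy warning.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.sqrt(var_hat / within)

rng = np.random.default_rng(0)
varying = rng.normal(size=(4, 1000))   # an ordinary parameter
constant = np.ones((4, 1000))          # like the [0, 0] entry of a correlation matrix

rhat_varying = classic_rhat(varying)
rhat_constant = classic_rhat(constant)

# A bandwidth proportional to the standard deviation collapses to zero.
bandwidth_constant = 0.9 * constant.std() * constant.size ** (-1 / 5)

print(rhat_varying, rhat_constant, bandwidth_constant)
```

The constant entry gives zero within-chain and between-chain variance, so the ratio is undefined, and any density estimate whose width scales with the spread of the draws degenerates as well.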

It may also be worth considering not plotting the diagonal of cov matrices in traceplots. There is an example in the ArviZ documentation (note that the labeller functionality is only available in the ArviZ development version, but selecting and ordering will work with any version).
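
The idea of skipping the diagonal can be sketched with plain NumPy, assuming the correlation draws are available as an array shaped (chains, draws, 2, 2). The synthetic array below is a stand-in for real posterior draws. In ArviZ itself the selection would go through the coords argument of plot_trace, with dimension names that depend on how the trace was converted, so that call is not shown here.

```python
import numpy as np

rng = np.random.default_rng(1)
n_chains, n_draws, dim = 4, 1000, 2

# Synthetic stand-in for posterior draws of a 2x2 correlation matrix:
# ones on the diagonal, a varying correlation off the diagonal.
r = rng.uniform(-0.9, -0.5, size=(n_chains, n_draws))
corr = np.ones((n_chains, n_draws, dim, dim))
corr[..., 0, 1] = r
corr[..., 1, 0] = r

# Keep only the strictly upper-triangular entries (here just [0, 1]).
rows, cols = np.triu_indices(dim, k=1)
off_diagonal = corr[..., rows, cols]

# The diagonal never varies, which is what breaks density estimates.
diag_std = corr[..., np.arange(dim), np.arange(dim)].std()

print(off_diagonal.shape, diag_std)
```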
