I have a dataset of measured compounds, and I’m trying to estimate their origin using CFA. My idea is that I will propose 2 latent variables, which represent some notion of human and natural activity and use the loadings to determine each observed variables’ origin. I’m relatively new to for CFA/SEM and PyMC, so I’m working off a modified version of this excellent blog post.
I started off with a sort of toy example assigning only 3 vars to each latent variable, and it seems to be working great. The models’ predictions seem to agree to a large extent with an unrelated Linear Regression Approach. My full dataset is pretty misshapen at 42x91. When I tried increasing the number of total variables to 28ish I’m getting problems, MCMC cant seem to resolve the posterior with the usual errors blah_blah.ravel() is zero. The problems seem to revolve around \nu and the LKJ prior. Here’s my slightly incoherent code:
def bayesian_confirmatory_factor_analysis(observed_data:pd.DataFrame, design_matrix:pd.DataFrame,β:str = "estimate", ν_σ:float = 2.5, ν_μ:float = 0.0, η_μ:float = 0.0, η_σ:float = 2.5, d_β:float = 2.5,obs_corr:bool = True, latent_corr:bool = True, Ω_η:float = 10,exp_λ:float = 1.0, Ψ_η:float = 1,Β_α:float = 1, Β_β:float =1,ϕ_μ:float = 0 ,ϕ_σ:float = 1.0 )->None: cfa_dims = namedtuple('cfa_dims', ['observations', 'observed_variables', 'latent_variables']) dims = cfa_dims(observations= observed_data.shape,observed_variables = observed_data.shape, latent_variables = design_matrix.shape) # μ = ν + Λ θ, where θ is the latent variable vector ν = pmc.Normal(name = 'ν',mu = ν_μ, sigma = ν_σ, shape = dims.observed_variables, ) PosNormal = pmc.Bound(pmc.Normal, lower = 0.0) η = PosNormal(name = 'η',mu =0,sigma = 0.1 , shape = dims.latent_variables) Δη = pmc.Deterministic('Δη', η-η) # Λ is a pxm factor loading intercept matrix. It derives from # Λ = Φ [β(1-M)+M] # Priors on unscaled factor loadings # Φ = pmc.HalfNormal(name = 'Φ', sigma = 10) Φ = pmc.HalfNormal(name = 'Φ', sd = ϕ_σ, shape = (dims.observed_variables, dims.latent_variables)) if isinstance(β,str): assert β == "estimate", f"β must be 'estimate' or a float" β = pmc.Beta(name = 'β', alpha = Β_α , beta = Β_β) # axis 1 destroyed. Must recreate M = np.asarray(design_matrix.values) # Generate scaled factor loadings f0 = np.zeros((dims.observed_variables,dims.latent_variables)) f1 = np.ones((dims.observed_variables,dims.latent_variables)) f0[11,0] = 1 f0[2,1] = 1 f1[11,0] = 0 f1[2,1] = 0 Λ0 = pmc.Deterministic('Λ0', Φ * (β*(1-M) + M )) Λ = pmc.Deterministic('Λ', (Λ0/ pmc.math.concatenate([ pmc.math.sum(Λ0,axis=1,keepdims=True), pmc.math.sum(Λ0,axis=1,keepdims=True)],axis=1 ))*f1 + f0 ) # μ = ν + Λ θ μ = pmc.Deterministic('μ',ν + pmc.math.matrix_dot(Λ, η)) # The convariance matrix is Σ = ΛΨΛ + Ρ # Ρ is the observed variable covariance matrix Ρ = DΩD # Priors for the item standard deviations D = pmc.HalfCauchy(name = "D", beta= d_β, shape = dims.observed_variables, ) #Priors for item correlations observed_correlations = pmc.LogNormal.dist(mu = 0.,sigma=0.25) # observed_correlations = pmc.LogNormal.dist(mu = 0, sd = 0.25) if not obs_corr: # If observations are not allowed to correlate # with each other Ω is an identity matrix Ω = np.eye(dims.observed_variables) else: G = pmc.LKJCholeskyCov(name = "G", eta = Ω_η, n = dims.observed_variables, sd_dist = observed_correlations) χ1 = pmc.expand_packed_triangular(dims.observed_variables, G,lower = True) K = pmc.math.dot(χ1, χ1.T) sd1 = theano.tensor.sqrt(theano.tensor.diag(K)) Ω = pmc.Deterministic("Ω",K /sd1[:,None]/sd1[None,:]) # Residual item variances and covariances Θ = pmc.Deterministic("θ", D[None,:]*Ω*D[:,None]) # Priors for factor correlations if not latent_corr: Ψ = np.eye(dims.latent_variables) else: L = pmc.LKJCholeskyCov(name = "L", eta = Ψ_η, n = dims.latent_variables, sd_dist = observed_correlations) χ = pmc.expand_packed_triangular(dims.latent_variables,L, lower = True) Ψ = pmc.Deterministic("Ψ", pmc.math.dot(χ,χ.T)) # _ψ = pmc.math.dot(χ,χ.T) # sd = theano.tensor.sqrt(theano.tensor.diag(_ψ)) # Ψ = pmc.Deterministic("Ψ", _ψ / sd[:,None]/sd[None,:]) # Variances and covariances for items Σ = pmc.math.matrix_dot(Λ,Ψ,Λ.T) + Θ # Priors on observations Y = pmc.MvNormal(name = "Y", mu = μ,cov = Σ, observed = observed_data.values, shape = observed_data.shape) scaled = pd.DataFrame(data = scale(f), columns = f.columns, index = f.index) with Model() as cfa_model: bayesian_confirmatory_factor_analysis(scaled,get_design_matrix(f, dmatrix ),latent_corr = True,Ω_η = 0.001, ν_σ= 0.1, ν_μ = 0, η_μ = 0.8, η_σ = 0.1)
Any ideas why MCMC cant finish running this? Not sure how to run prior checks in something this complex, I’ve looked at distplots for various params and things look reasonable save for some strange low density extremes values at 10 \sigma . The problems seem to revolve around \nu and \Sigma
Omega has some odd behavior. I Initially is set \eta = 0.0001 since I know my variables have very strong correlations with each other and while this samples normally as far as I can tell, I cant sample from the prior. I’m getting LinAlg error “matrix not positive definite”. My understanding was LKJ guarantees positive,definite matrices