In case anyone stumbles on this question:
There is actually a theoretically sound identification strategy based on the LQ decomposition.
To briefly sum it up: if we have a matrix of weights W and a matrix of scores S, and only the product W @ S is constrained by the data, then W and S are individually unidentified up to scale, permutation, rotation and sign flips (W can compensate for any such change in S and vice versa). Putting normal priors on the elements of both gets rid of the scale issue, but permutations, rotations and sign flips remain.
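To see the rotation problem concretely, here is a tiny NumPy sketch (the dimensions and the way I build an orthogonal Q via a QR decomposition are just for illustration): any rotated pair (W @ Q.T, Q @ S) reproduces W @ S exactly, so the likelihood cannot tell them apart.

import numpy as np

rng = np.random.default_rng(0)
od, fd, n = 10, 5, 300

W = rng.normal(size=(od, fd))   # weights
S = rng.normal(size=(fd, n))    # scores

# Any orthogonal Q yields an equally good (W', S') pair
Q, _ = np.linalg.qr(rng.normal(size=(fd, fd)))
W_rot, S_rot = W @ Q.T, Q @ S

# The product, which is all the likelihood sees, is unchanged
print(np.allclose(W @ S, W_rot @ S_rot))  # True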
However, if you take the LQ factorization of W, so that W = L @ Q, you end up with W @ S = L @ Q @ S.
Q is orthogonal, so we can simply absorb it into S: S' = Q @ S. If the elements of S are i.i.d. standard normal, S is rotation-invariant and Q @ S has the same distribution as S.
L is lower triangular, and as D. Leung and M. Drton prove in "Order-invariant prior specification in Bayesian factor analysis", if the original W was i.i.d. normal, then the strictly lower-triangular entries of L are also i.i.d. normal and the diagonal entries follow Chi distributions (they don't name the distribution, but that is what it is), with fd - j + 1 degrees of freedom for the j-th diagonal entry when there are fd factors. That is where the nu argument in the code below comes from.
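If you want to convince yourself of this numerically, here is a small standalone check. NumPy has no LQ routine, so I get L from the QR decomposition of W.T (W = L @ Q is equivalent to W.T = Q.T @ L.T); the specific dimensions and Monte Carlo size are arbitrary.

import numpy as np

rng = np.random.default_rng(1)
od, fd, reps = 10, 5, 5000

diags, lowers = [], []
for _ in range(reps):
    W = rng.normal(size=(od, fd))
    # LQ of W via QR of the transpose: W.T = Qr @ R  =>  W = R.T @ Qr.T = L @ Q
    Qr, R = np.linalg.qr(W.T)
    L = R.T
    # QR is only unique up to signs; flip columns so the diagonal of L is positive
    L = L * np.sign(np.diag(L))
    diags.append(np.diag(L))
    lowers.append(L[1, 0])  # one strictly lower-triangular entry

diags = np.array(diags)
print((diags ** 2).mean(axis=0))        # ~ [5, 4, 3, 2, 1], since E[chi2_nu] = nu
print(np.mean(lowers), np.std(lowers))  # ~ 0 and ~ 1, i.e. standard normal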
So, with this, identification becomes very easy.
import pymc as pm, pytensor.tensor as pt
import numpy as np
import pymc_extras as pmx
# Output dimension, real dimension, factor dimension, number of samples
od,rd,fd,n = 10,7,5,300
# Create a dataset to test the FA
real_mat = np.random.normal(0,1,size=(rd,od))
real_scores = np.random.normal(0,1,size=(n,rd))
obs = real_scores @ real_mat + np.random.normal(0,0.1,size=(n,od))
with pm.Model() as model:
    # Create the W matrix
    tu_inds = np.triu_indices(n=fd, m=od, k=1)
    wvals = pm.Normal('wvals', size=len(tu_inds[0]))
    wdvals = pmx.Chi('wdvals', size=fd, nu=fd - np.arange(fd))
    W = pt.zeros((fd, od))
    W = pt.set_subtensor(W[tu_inds], wvals)
    W = pt.set_subtensor(W[np.arange(fd), np.arange(fd)], wdvals)
    pm.Deterministic('W', W)
    # Non-marginalized model that directly gives the scores
    scores = pm.Normal('scores', mu=0, sigma=1, shape=(n, fd))
    sd = pm.HalfNormal('sd', sigma=0.5)
    pm.Normal('obs', mu=scores @ W, sigma=sd, shape=(n, od), observed=obs)
    # Alt: Marginalized model that leaves scores implicit
    # pm.MvNormal('obs', mu=0, cov=W.T @ W + sd**2 * pt.eye(od), shape=(n, od), observed=obs)
    idata = pm.sample(tune=2000)
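Not strictly necessary, but a quick sanity check on the fit: compare the marginal covariance implied by the posterior means, W.T @ W + sd**2 * I (the same expression as in the marginalized variant above), against the empirical covariance of obs. They should be close if the factor model explains the data.

W_mean = idata.posterior['W'].mean(dim=('chain', 'draw')).values    # (fd, od)
sd_mean = idata.posterior['sd'].mean(dim=('chain', 'draw')).values  # scalar

implied_cov = W_mean.T @ W_mean + sd_mean**2 * np.eye(od)
empirical_cov = np.cov(obs, rowvar=False)

# Largest absolute discrepancy between the two covariance matrices
print(np.abs(implied_cov - empirical_cov).max())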
And to check if all chains converged to the same solution:
import pandas as pd
from plotnine import *
# Create a random map of the loading columns into two dimensions
random_map = np.random.normal(0, 1, size=(fd, 2))
pvals = np.swapaxes(idata.posterior.W.values, -1, -2) @ random_map
# Reshape pvals into a long-form dataframe with category columns
df_long = pd.DataFrame([
    {
        'chain': chain_idx,
        'point': point_idx,
        'x': pvals[chain_idx, draw_idx, point_idx, 0],
        'y': pvals[chain_idx, draw_idx, point_idx, 1],
    }
    for chain_idx in range(pvals.shape[0])
    for draw_idx in range(pvals.shape[1])
    for point_idx in range(pvals.shape[2])
])
ggplot(df_long,aes(x='x',y='y',color='factor(point)')) + geom_point(alpha=0.3) + facet_wrap('~chain')
Usually, they do, and the resulting plot shows the same clusters of projected points in every chain's facet.
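And if you prefer a numeric diagnostic over eyeballing facets: because W is now identified, plain R-hat values on its elements should sit near 1 whenever the chains really did land on the same solution (ArviZ comes with PyMC).

import arviz as az

print(az.summary(idata, var_names=['W'])['r_hat'].max())  # ~1.0 when all chains agree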