Positive Log Marginal Likelihood with SMC Sampler

Hello ,

I’m encountering an unusual issue while using SMC sampling to compute the log marginal likelihood for a multivariate skew normal model. The computed log marginal likelihood is positive (90.31), which seems mathematically unexpected since log probabilities should typically be negative.

Here’s my model setup:

def multivariate_skew_normal_logp(value, xi, chol, alpha):
    # Custom logp function for multivariate skew normal distribution
    k = xi.shape[0]
    diff = value - xi
    z = pt.slinalg.solve_triangular(chol, diff.T, lower=True).T
    log_det = 2 * pt.sum(pt.log(pt.diag(chol)))
    
    mahalanobis = pt.sum(z**2, axis=1)
    mv_normal_logp = -0.5 * k * pt.log(2 * np.pi) - 0.5 * log_det - 0.5 * mahalanobis
    
    Omega = pt.dot(chol, chol.T)
    omega = pt.sqrt(pt.diag(Omega))
    z_standardized = diff / (omega + 1e-8)
    skew_argument = pt.sum(z_standardized * alpha, axis=1)
    norm_cdf = 0.5 * (1 + pt.erf(skew_argument / pt.sqrt(2)))
    skew_logp = pt.log(norm_cdf)

    constant_part = pt.log(2.0)  
    total_logp = constant_part + mv_normal_logp + skew_logp
    return total_logp

with pm.Model() as skew_normal_model:
    num_features = Theta_normalized.shape[1]
    mu = pm.Normal('mu', mu=0, sigma=1, shape=num_features)
    chol, corr, stds = pm.LKJCholeskyCov(
        'chol_cov',
        n=num_features,
        eta=10,
        sd_dist=pm.HalfNormal.dist(sigma=1.0),
        compute_corr=True
    )
    alpha = pm.Normal('alpha', mu=1, sigma=1, shape=num_features)
    obs = pm.DensityDist(
        'obs',
        mu, chol, alpha,
        logp=multivariate_skew_normal_logp,
        observed=Theta_normalized
    )
    
with skew_normal_model:
    trace_smc_skew_cov = pm.sample_smc(5000, chains=8)

The SMC sampling shows some convergence issues (Rhat ≈ 1.02 for some parameters, and NaN for chol_cov_corr[0, 0]). However, when I sample the same model using NUTS

with skew_normal_model:
trace_sn = pm.sample(
5000,
tune=5000,
chains=8,
target_accept=0.9,
nuts={‘max_treedepth’: 20},
init=“advi+adapt_diag”
)

All parameters show good convergence (Rhat ≈ 1.0,but a NaN for chol_cov_corr[0, 0]).

My questions:

  1. Why might SMC sampling produce a positive log marginal likelihood?

  2. Could this be related to SMC sampler settings or the model construction?

  3. Are there specific SMC parameters I should adjust to improve convergence and get reliable marginal likelihood estimates?

Any insights would be greatly appreciated!

Likelihood/densities aren’t constrained between 0/1. The density of a uniform distribution between (0, 1/2) is 2

I think the marginal likelihood is the integral of the probability density function in the numerator part of Bayes’ formula, so its value should be between 0 and 1?
I’m using Bayes factors for model comparison, and the other three models are: a diagonal multivariate normal, a multivariate normal using a covariance matrix, and fitting a skew normal distribution separately for each dimension. Their log marginal likelihoods are all negative: -60, -65, and -48 respectively. From the PPC (posterior predictive checks), their fits all appear better than this multivariate skew normal model. However, because this multivariate skew normal model has a positive log marginal likelihood, the Bayes factor incorrectly suggests it’s the best model. So I suspect there might be issues with the sampling or the model itself.

The marginal likelihood is not a probability, but a likelihood function. It doesn’t have to be bound by 1, otherwise it couldn’t be used to normalize the numerator in bayes term when that integrates to > 1 over the parameter space.

Now to your problem, are you sure you wrote the skewnormal density correctly?

Thanks,I understand why the log marginal likelihood is positive.
As for the skew mvnormal,I followed the approach in Azzalini A.'s book “The Skew-Normal and Related Families” (Cambridge University Press, 2013) to define the skew-normal distribution. The definition method is: 2 ϕ_d(x−ξ; Ω) Φ(α^⊤ ω^{-1}(x−ξ)), where:

  • ϕ_d is the probability density function of the d-dimensional multivariate normal distribution
  • Φ is the cumulative distribution function of the univariate standard normal distribution
  • ω = (Ω ⊙ I_d)^{1/2}, and ⊙ denotes the entry-wise or Hadamard product

I have verified that the probability distribution I computed using this function matches the probability distribution plots shown in the book.
Additionally, the sampling results for chol_cov_corr[0, 0] in my model appear abnormal - its trace shows as a flat line, and the r_hat value displays as None (while other parameter traces look relatively normal with r_hat=1.0). Could this be causing the abnormal final results?

The diagonal of the correlation matrix is deterministically 1, so it breaks all of our sampling diagnostics. It’s expected and not issue (but quite annoying)