Unexpected log likelihood results

Hello, I have been working on a Bayesian Poisson GLM using PyMC and am interested in analyzing the log likelihood at certain points. To do this I am using model.compile_logp(vars=model.observed_RVs)(point). However, I notice the values of the log likelihood are different from those produced by manually calculating the log likelihood. I have also confirmed that my manual implementation matches the log likelihood produced by fitting an equivalent frequentist model with statsmodels. Any help explaining this discrepancy would be much appreciated!

Can you share code that reproduces the mismatch you’re seeing?

import numpy as np
import pymc as pm
from scipy.special import gammaln

data = prepare_data()

coords = {
    "cat1_level": data["cat1_names"],
    "cat2_level": data["cat2_names"],
    "cat3_level": data["cat3_names"],
}

with pm.Model(coords=coords) as model:
    X1_data = pm.Data("X1_data", data["X1"])
    X2_data = pm.Data("X2_data", data["X2"])
    X3_data = pm.Data("X3_data", data["X3"])

    alpha = pm.Normal("alpha", mu=0.0, sigma=SIGMA_PRIOR)
    beta1 = pm.Normal("beta1", mu=0.0, sigma=SIGMA_PRIOR, dims="cat1_level")
    beta2 = pm.Normal("beta2", mu=0.0, sigma=SIGMA_PRIOR, dims="cat2_level")
    beta3 = pm.Normal("beta3", mu=0.0, sigma=SIGMA_PRIOR, dims="cat3_level")
    
    linear_term = (
        alpha
        + pm.math.dot(X1_data, beta1)
        + pm.math.dot(X2_data, beta2)
        + pm.math.dot(X3_data, beta3)
    )

    mu = pm.Deterministic("mu", pm.math.exp(linear_term))

    pm.Poisson("y_obs", mu=mu, observed=data["y"])

    map_estimate = pm.find_MAP()

    logp_fn = model.compile_logp(vars=model.observed_RVs)

map_point = {
    "alpha": map_estimate["alpha"],
    "beta1": np.asarray(map_estimate["beta1"]),
    "beta2": np.asarray(map_estimate["beta2"]),
    "beta3": np.asarray(map_estimate["beta3"]),
}

pymc_log_likelihood = float(logp_fn(map_point))

mu = np.exp(
    map_point["alpha"]
    + data["X1"].dot(map_point["beta1"])
    + data["X2"].dot(map_point["beta2"])
    + data["X3"].dot(map_point["beta3"])
)

manual_log_likelihood = np.sum(
    data["y"] * np.log(mu) - mu - gammaln(data["y"] + 1.0)
)

print("\n=== PyMC Log Likelihood ===")
print(f"Log likelihood at MAP: {pymc_log_likelihood:.6f}")

print("\n=== Manual Log Likelihood ===")
print(f"Manual log likelihood at MAP: {manual_log_likelihood:.6f}")

=== PyMC Log Likelihood ===
Log likelihood at MAP: -9115.067879

=== Manual Log Likelihood ===
Manual log likelihood at MAP: -11810.939595
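For what it’s worth, the manual formula itself can be cross-checked against scipy.stats.poisson.logpmf — the arrays below are toy stand-ins, not the thread’s actual data:

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import poisson

y = np.array([0.0, 1.0, 3.0, 7.0])   # toy counts (stand-in for data["y"])
mu = np.array([0.5, 1.2, 2.8, 6.9])  # toy rates (stand-in for the fitted mu)

# Same expression as the manual computation above
manual = np.sum(y * np.log(mu) - mu - gammaln(y + 1.0))

# SciPy's reference implementation of the Poisson log-pmf
scipy_ll = poisson.logpmf(y, mu).sum()

print(manual, scipy_ll)  # the two agree to floating-point precision
```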

Your manual computation looks like it’s only the log-likelihood without the contributions from the priors

I was under the impression that setting vars=model.observed_RVs in compile_logp() would calculate only the log likelihood rather than the full log posterior

You’re correct, I think @jessegrabowski missed that point

This is how the Poisson logp is implemented in PyMC: pymc/pymc/distributions/discrete.py at f69e159cb8344da8576cdb6b49fc81be766dec5e · pymc-devs/pymc · GitHub

Other than the special behavior of logpow, I could imagine some float precision differences? PyTensor won’t compute the naive log(exp(mu))
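For readers unfamiliar with it: logpow(x, m) computes log(x**m) with the x == 0 edge cases guarded, since the naive m * log(x) is NaN at x = m = 0 even though x**m = 1 there. A rough NumPy imitation of that behavior (not the actual PyTensor graph):

```python
import numpy as np

def logpow(x, m):
    """log(x ** m), guarding the x == 0 cases where m * log(x) misbehaves."""
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(
            np.equal(x, 0),
            # 0**0 = 1 -> log is 0; 0**m = 0 for m > 0 -> log is -inf
            np.where(np.equal(m, 0), 0.0, -np.inf),
            m * np.log(x),
        )

with np.errstate(divide="ignore", invalid="ignore"):
    print(0.0 * np.log(0.0))  # nan — the naive product
print(logpow(0.0, 0.0))       # 0.0
print(logpow(0.0, 3.0))       # -inf
```

In the model above mu = exp(linear_term) is strictly positive, so this edge case should not fire; it is mentioned only as a place where the graphs could differ in principle.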

Yes I missed that, sorry :slight_smile:

I found the issue. Many of my y values were not exact integers (e.g. 1.000001, 0.999999), and rounding to the nearest integer in the preprocessing seems to have solved this. Thanks for your help with this.
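A plausible mechanism for the size of the discrepancy (an assumption about PyMC’s internals, not verified here): observed values for a discrete RV such as Poisson end up cast to an integer dtype, and an integer cast truncates toward zero rather than rounding, so 0.999999 becomes 0 while the manual NumPy computation keeps it at ~1. A small NumPy/SciPy sketch of how much that shifts the Poisson log-pmf:

```python
import numpy as np
from scipy.stats import poisson

mu = 2.0
y_raw = np.array([1.000001, 0.999999])  # nearly-integer "counts"

y_trunc = y_raw.astype("int64")           # integer cast truncates toward zero
y_round = np.rint(y_raw).astype("int64")  # rounding keeps both at 1

print(y_trunc)                      # [1 0]
print(y_round)                      # [1 1]
print(poisson.logpmf(y_trunc, mu))  # log-pmf with truncated counts
print(poisson.logpmf(y_round, mu))  # log-pmf with rounded counts
```

Summed over thousands of observations, many such truncations would produce exactly the kind of large gap seen above, which is consistent with rounding the data fixing it.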
