find_MAP struggles with hierarchical model

Hi,
I’m using pymc on a hierarchical binomial model. The logit of the probability is Gaussian with a certain std. dev. that I am trying to recover (called ‘si’ in the code).

While the MC samples look fine, find_MAP() does not seem to find the right values when data is limited (N=100 in the code). In particular, it underestimates si, even when initialized at the ground-truth value.

Note that with sufficient data (N=1000 in the script), things look fine.

Is there a way to make find_MAP more robust when data is limited?

import pymc as pm
import numpy as np
import arviz as az
import matplotlib.pyplot as plt

plt.ion()
plt.close("all")

M = 100  # number of experiments
N = 100  # binomial N per experiment
# when N=1000, the MAP solution looks fine,
# but when N=100, MAP finds a much too low value for sigma.

N_ar = np.array([N] * M)
pbin = 0.5  # binomial probability

Xp = np.log(pbin / (1 - pbin))  # logit of p
siX = 0.2  # std. dev. of logit p across experiments
p_ar = 1 / (1 + np.exp(-np.random.normal(Xp, siX, M)))
print("ground truth:  mean(X) =", Xp, ", sigma(X) =", siX)

# generate data
k_ar = np.random.binomial(N_ar, p_ar)

with pm.Model():
    mu = pm.Normal("mu", 0, sigma=10)
    si = pm.HalfNormal("si", 1)  # pushes si to be small
    lth = pm.Normal("lth", mu, si, shape=N_ar.size)
    yobs = pm.Binomial("yobs", n=N_ar, logit_p=lth, observed=k_ar)
    trace = pm.sample(cores=4, chains=4, target_accept=0.95)

    MAP = pm.find_MAP(start={"si": siX, "mu": Xp})
    print("MAP results: mu =", MAP["mu"], ", si =", MAP["si"])  # si is wrong when N=100

    # MAP = pm.find_MAP(method="Powell")  # even worse
    # print("MAP results with Powell:", MAP)

    print("MC results:", az.summary(trace, var_names=["mu", "si"]))  # this works fine

This is a known phenomenon, and basically the reason why MAP estimation has fallen out of favor. You can somewhat mitigate it by using a zero-avoiding prior on the sigma of the group-level effect, e.g. si = pm.Gamma('si', alpha=2, beta=0.1) (the rate must be strictly positive; a small rate keeps the prior weakly informative). This is the advice of the Stan prior-choice FAQ.
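To see what "zero-avoiding" buys you, here is a small sketch (my own, assuming scipy is available; beta=0.1 is just an illustrative rate): a Gamma(2, beta) density is proportional to sigma * exp(-beta * sigma), so it is exactly zero at sigma = 0 and pushes the mode off the boundary, whereas a HalfNormal density is maximal at sigma = 0 and happily lets the optimizer sit there.

```python
# Compare prior densities near sigma = 0 (illustrative values, not from the post)
import numpy as np
from scipy import stats

beta = 0.1  # small rate -> weakly informative (assumed value)
gamma_prior = stats.gamma(a=2, scale=1 / beta)   # Gamma(alpha=2, rate=0.1)
halfnormal_prior = stats.halfnorm(scale=1)       # the prior in the original model

sigmas = np.array([1e-6, 0.01, 0.2, 1.0])
print("Gamma(2, 0.1) pdf: ", gamma_prior.pdf(sigmas))
print("HalfNormal(1) pdf: ", halfnormal_prior.pdf(sigmas))

# At the boundary itself: Gamma(2, .) vanishes, HalfNormal does not.
print(gamma_prior.pdf(0.0))       # 0.0
print(halfnormal_prior.pdf(0.0))  # sqrt(2/pi), about 0.80
```

The vanishing density at zero is what keeps the joint mode away from the sigma = 0 boundary.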

In general, if you have a hierarchical model you should be doing MCMC.

Thank you for the quick reply.
(This nice community is sufficient reason to do MCMC :)


If you really need a faster approximation than MCMC, try ADVI or pathfinder. We just buffed the implementation of pathfinder in pymc-extras, so that’s worth a second look if you’ve tried it before. There’s also a MAP+Laplace approximation in extras that can be good too (if MAP works on your problem in the first place, of course). Example of that one is here.
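For intuition on what MAP + Laplace does, here is a minimal 1-D sketch of my own (scipy only, not the pymc-extras API): for a binomial probability under a flat prior the posterior is a known Beta, so we can check the Laplace approximation, N(mode, -1/logpost''(mode)), against the exact posterior standard deviation.

```python
# MAP + Laplace on a toy 1-D posterior: p | k ~ Beta(k+1, n-k+1) under a flat prior.
import numpy as np
from scipy import optimize, stats

n, k = 50, 18  # assumed toy data: 18 successes in 50 trials

def neg_logpost(p):
    # negative log posterior up to a constant (flat prior on p)
    return -(k * np.log(p) + (n - k) * np.log(1 - p))

res = optimize.minimize_scalar(neg_logpost, bounds=(1e-6, 1 - 1e-6), method="bounded")
p_map = res.x  # should land at k/n

# Laplace: curvature from the analytic second derivative of the log posterior
d2 = -(k / p_map**2 + (n - k) / (1 - p_map) ** 2)
laplace_sd = np.sqrt(-1.0 / d2)

exact = stats.beta(k + 1, n - k + 1)
print(f"MAP:        {p_map:.4f}  (k/n = {k / n:.4f})")
print(f"Laplace sd: {laplace_sd:.4f}")
print(f"Exact sd:   {exact.std():.4f}")
```

With plenty of data and a roughly Gaussian posterior the two standard deviations agree well; the hierarchical-sigma case above is precisely where this breaks down, since the mode itself is in the wrong place.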
