Unable to find maximum with NUTS. Too many local maxima maybe?

StFroese · February 18, 2022, 10:49am

I’m currently trying to sample from my model using the default NUTS sampler but I have problems finding the the maximum that I want (currently testing on known data).

Here’s my model:

amplitude_model = pymc3.Model()
with amplitude_model:
#     amplitude = pymc3.Uniform('amplitude', 1, 40) # uniform prior for A
    l = pymc3.Uniform('l', 0, 127) 
#     m = pymc3.Uniform('m', 0, 64)
    source_A = pointSourceExt_theano2(128,25,l,55) 

    vis_A = theano_RIME_simple(source_A, uvstack, 128) # calculate visibilites using RIME

    vis_A_flat = nonzeroElements(vis_A) # only need nonzero elements
    vis_A_flat = vis_A[vis.nonzero()[0],vis.nonzero()[1],:]
    
    vis_flat = vis[vis.nonzero()] # same here
    vis_flat_stack = np.stack((vis_flat.real,vis_flat.imag), axis=1)

    
    cov = 0.5*np.eye(vis_flat_stack.shape[0]*2,vis_flat_stack.shape[0]*2)


    sigma=0.1
    pymc3.MvNormal('visR', mu=vis_A_flat.flatten(), cov=sigma*cov ,observed=vis_flat_stack.flatten())#vis_A_flat.flatten()

The purpose is to find the location of an point source in a sky model on the l-axis. The data points that I measure are called visibilities and they result from the fourier transform of the sky model.

My likelihood is a multivariate normal distribution with dimension 2*N, since I have N measurements in complex space.

The sampled posterior looks like this:
posteriot

My chains seem to freeze on local maxima. Any ideas how to fix this? The true maximum should be at 45.

ckrapu · February 19, 2022, 3:58pm

Without a fully reproducible example, it’s challenging to figure out where the pathologies in this problem come from. Some thoughts:

Can you let sigma have a prior distribution like pm.HalfNormal('sigma', sd=0.1) instead?
If you are using a diagonal covariance matrix, you can just as easily use pm.Normal('visR', mu=vis_A_flat.flatten(), sigma=sigma ,observed=vis_flat_stack.flatten())) since it’s the same likelihood, just faster to evaluate.
Using the pm.Metropolis() backup as the sampler, can you get any of your chains to find the maximum? You could try using some large number of chains while sampling to test this out.
This is probably a parameterization issue for your model; issues like these are not terribly uncommon when a primarily physics-driven / engineering model is plugged into and MCMC framework - it tends to reveal when the model has sharp posterior gradients or has identifiability issues without the use of strong priors.

StFroese · February 22, 2022, 11:44am

This actually solved my problem! I think what helped the most was switching from an MvNormal with a fixed value for sigma to and a Normal with a prior for sigma.

Topic		Replies	Views
Feeding NUTS initialization of model parameters Questions	3	693	April 19, 2023
NUTS - FromFunctionOp has no attribute 'grad' Questions theano	2	996	August 3, 2018
NUTS sampler converges to wrong value Questions	7	794	September 9, 2020
Using pymc3 NUTS sampler for an "external model" having its derivatives with respect to its parameters Questions	4	2364	March 30, 2018
NUTS speeds issue in model? Questions	1	442	January 28, 2019

Unable to find maximum with NUTS. Too many local maxima maybe?

Related topics