# Unable to find maximum with NUTS. Too many local maxima maybe?

I’m currently trying to sample from my model using the default NUTS sampler but I have problems finding the the maximum that I want (currently testing on known data).

Here’s my model:

``````amplitude_model = pymc3.Model()
with amplitude_model:
#     amplitude = pymc3.Uniform('amplitude', 1, 40) # uniform prior for A
l = pymc3.Uniform('l', 0, 127)
#     m = pymc3.Uniform('m', 0, 64)
source_A = pointSourceExt_theano2(128,25,l,55)

vis_A = theano_RIME_simple(source_A, uvstack, 128) # calculate visibilites using RIME

vis_A_flat = nonzeroElements(vis_A) # only need nonzero elements
vis_A_flat = vis_A[vis.nonzero(),vis.nonzero(),:]

vis_flat = vis[vis.nonzero()] # same here
vis_flat_stack = np.stack((vis_flat.real,vis_flat.imag), axis=1)

cov = 0.5*np.eye(vis_flat_stack.shape*2,vis_flat_stack.shape*2)

sigma=0.1
pymc3.MvNormal('visR', mu=vis_A_flat.flatten(), cov=sigma*cov ,observed=vis_flat_stack.flatten())#vis_A_flat.flatten()
``````

The purpose is to find the location of an point source in a sky model on the l-axis. The data points that I measure are called visibilities and they result from the fourier transform of the sky model.

My likelihood is a multivariate normal distribution with dimension 2*N, since I have N measurements in complex space.

The sampled posterior looks like this: My chains seem to freeze on local maxima. Any ideas how to fix this? The true maximum should be at 45.

Without a fully reproducible example, it’s challenging to figure out where the pathologies in this problem come from. Some thoughts:

• Can you let `sigma` have a prior distribution like `pm.HalfNormal('sigma', sd=0.1)` instead?

• If you are using a diagonal covariance matrix, you can just as easily use `pm.Normal('visR', mu=vis_A_flat.flatten(), sigma=sigma ,observed=vis_flat_stack.flatten()))` since it’s the same likelihood, just faster to evaluate.

• Using the `pm.Metropolis()` backup as the sampler, can you get any of your chains to find the maximum? You could try using some large number of chains while sampling to test this out.

• This is probably a parameterization issue for your model; issues like these are not terribly uncommon when a primarily physics-driven / engineering model is plugged into and MCMC framework - it tends to reveal when the model has sharp posterior gradients or has identifiability issues without the use of strong priors.

2 Likes

This actually solved my problem! I think what helped the most was switching from an `MvNormal` with a fixed value for `sigma` to and a `Normal` with a prior for `sigma`.