Simple exponential model fails to find map_estimate

codeman · July 1, 2021, 12:29am

Here is the model I’m using:

True parameter values

beta, gamma = 0.21, 0.07

Size of dataset

days = 50

Predictor variable

time = np.arange(0,days,1)

Simulate outcome variable

data = []
for t in time:
data.append((beta/((beta-gamma))(np.exp(t(beta-gamma))-1)+1) + np.random.normal(0,100)

basic_model = pm.Model()

with basic_model:

# Priors for unknown model parameters
beta = pm.Normal("alpha", mu=0, sigma=1)
gamma = pm.Normal("beta", mu=0, sigma=1)

# Expected value of outcome
mu = (beta/((beta-gamma))*(np.exp(time*(beta-gamma))-1)+1)

# Likelihood (sampling distribution) of observations
Y_obs = pm.Normal("Y_obs", mu=mu, sigma=100, observed=data)

But, when I run map_estimate, I simply get the mean of the priors for the estimate: (0,0).

What’s going on here?

cluhmann · July 1, 2021, 2:30am

Is there some particular reason you are using Metropolis and looking for the MAP? Permitting pm.sample() to select the sampling algorithm(s) it thinks are best (based on your model) is typically a good idea. For models with continuously distributed parameters, it defaults to NUTS, which is usually a good place to start. Have you just tried calling trace = pm.sample()?

codeman · July 1, 2021, 3:53am

I just tried that, but it unfortunately resulted in the same issue

codeman · July 1, 2021, 4:00am

I am essentially trying to use mcmc to determine the true parameters from the noisy data. If the program runs properly, it should produce the values (0.21,0.07), which I have set at the top of the code.

cluhmann · July 1, 2021, 6:41pm

I suspect that you are running into a couple of issues. First, your likelihood includes a term with beta-gamma in the denominator. Because your priors over these parameters are identical, this denominator is initially being zero, which will tend to kill sampling before it can begin. Second, you are trying to do inference on what is basically noise. The signal reflecting your beta and gamma parameters is buried in a huge amount of noise. So it’s going to be a challenge to recover the values of beta/gamma. This, coupled with the relatively uninformative priors, are likely to make sampling tough.

codeman · July 1, 2021, 7:30pm

After changing the function to account for when beta=gamma (to make sure the denominator is never 0), the sampling finally runs, however it gets the parameters extremely wrong, even when I turn the noise down to almost 0.

cluhmann · July 1, 2021, 7:31pm

Can you post the updated model code?

cluhmann · July 1, 2021, 7:51pm

There’s this as well:

That’s the joint posterior when I re-ran your model with zero noise (and slightly offset priors so that the beta-gamma wouldn’t immediately be zero). So the model you are working with seems overspecified/underconstrained by your data (which makes some sense looking at it, because each parameter mostly only shows up in the difference beta-gamma).

codeman · July 1, 2021, 8:01pm

Here is the updated code

# True parameter values
beta, gamma = 0.21, 0.07

# Size of dataset
days = 50

# Predictor variable
time = np.arange(0,days,1)

# Simulate outcome variable
data = []
for t in time:
    data.append((beta/((beta-gamma))*(np.exp(t*(beta-gamma))-1)+1) + np.random.normal(0,1))

fig, axes = plt.subplots(1, 1, sharex=True, figsize=(10, 4))
axes.scatter(time, data, alpha=0.6)
axes.set_ylabel("Y")
axes.set_xlabel("time")

basic_model = pm.Model()

def smodel(beta,gamma):
    if beta==gamma:
        return time
    s = beta/((beta-gamma))*(np.exp(time*(beta-gamma))-1)+1
    return s

with basic_model:

    # Priors for unknown model parameters
    beta = pm.Normal("beta", mu=1, sigma=10)
    gamma = pm.Normal("gamma", mu=0.07, sigma=10)

    y_obs = pm.Normal('obs', mu=smodel(beta,gamma), sigma=1,observed=data)
    
    # Draw the specified number of samples
    trace = pm.sample()#step=pm.Metropolis())

If I run this with NUTS, I get the error “Mass matrix contains zeros on the diagonal.” If I run it with Metropolis, it gets the posterior quite incorrectly.

As for the plot you made, that looks about right. However, with minimal noise, the algorithm should be able to infer that of those points, (beta=0.21,gamma=0.07) are the ones with the highest likelihood.

cluhmann · July 1, 2021, 8:09pm

Well that plot reflects the result with no noise at all. The problem is that, even under these artificially convenient circumstances, you still can’t be particularly certain about these values because (as that plot shows), lots of beta-gamma value pairs will get you approximately the same behavior out of your model. I suspect that this entirely collapses once you add sufficient noise to the data.

codeman · July 1, 2021, 8:22pm

When I use the same model and evaluate the posterior at 1000x1000 grid points, even with a decent amount of noise the grid point with the highest likelihood will be around (0.21,0.07), and the marginal posteriors seem to be distributed around the true values. So I suspect the problem is not with the model itself, but with either the sampling or the way I’m putting it into pymc3.

cluhmann · July 1, 2021, 8:40pm

I think the marginals look fine.

The problem is that the dependency between the parameter values creates uncertainty that is unresolvable given your data and your model. The MAP/MLE could be exactly (0.21,0.07). But introduce any minor perturbation of beta and you (or the sampler) can find a minor modification of gamma that largely compensates. This does (and should) cause you (and the sampler) to be uncertain about (0.21,0.07) being “the right answer”.

codeman · July 1, 2021, 9:16pm

You are exactly right, my goal of this analysis was to quantify how uncertain the posterior is with a weak prior on gamma vs with a strong prior on gamma. Also, the plots you are getting look a lot better than the ones I’m getting. Can you please send me the code you used? Maybe there’s an issue with the way the packages are installed on my computer?

cluhmann · July 1, 2021, 9:29pm

I made some minor modifications, but nothing that should matter too much. I modified the default sampler initialization because it jitters the start point and can cause problems when your model is sensitive to small changes in parameter values (which is the case here). For that last plot, I also centered the priors closer to the true parameter values, just to illustrate the uncertainty that persists under highly optimistic circumstances.

So just to be clear, I think there are two related but separate issues here. One is that there is a strong dependency between your two parameters. This should cause you be somewhat uncertain about any particular estimate. That much is fine in the sense that it as it should be (I think). The other issue is that the the strong dependency coupled with the model’s overall sensitivity to small changes to the parameter values is going to give the sampler quite a bit of trouble. The sampler is going to want to roll back and forth along that “ridge” depicted in the pairplot and any small deviation from that straight line is going to cause the model to deviate wildly from the data. Noisy data is not likely to help the situation much. Wide priors centered far from the true parameter values aren’t really helping out either.

Overall, this is a situation in which I might look for ways to respecify my model so that it was a bit less sensitive and so that I could specify priors that were simultaneously agnostic but not problematic.

import pymc3 as pm
import numpy as np
import pandas as pd
import arviz as az
import matplotlib.pyplot as plt

beta, gamma = 0.21, 0.07

days = 50
time = np.arange(0, days, 1)
data = beta / (beta-gamma) * (np.exp(time * (beta-gamma)) -1 ) + 1

noisesd = 0
data += np.random.normal(0,noisesd,size=(days,))

with pm.Model() as basic_model:

    # Priors for unknown model parameters
    beta = pm.Normal("beta", mu=.2, sigma=1)
    gamma = pm.Normal("gamma", mu=0.1, sigma=1)

    # Expected value of outcome
    mu = pm.Deterministic('mu', beta / (beta-gamma)*(np.exp(time*(beta-gamma))-1)+1)

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal("Y_obs", mu=mu, sigma=100, observed=data)
    trace = pm.sample(init='adapt_diag', tune=5000)
    print(trace)
    az.plot_trace(trace)
    plt.show()
    az.plot_posterior(trace, var_names=['beta', 'gamma'])
    plt.show()
    az.plot_pair(trace, marginals=True, var_names=['beta', 'gamma'])
    plt.show()

codeman · July 2, 2021, 3:54am

Thank you so much for all your help. I really appreciate it. Everything seems to be running fine now.

Topic		Replies	Views
A little change to the sample code of MAP, however Questions	10	1118	April 1, 2018
Getting wildly inaccurate MAP estimates in a Poisson Model version agnostic modeling	1	302	November 22, 2022
Common reasons for getting a map estimate that is far from the mode of the posterior v5	9	1212	April 29, 2023
Frequently Asked Questions Questions	12	25039	June 30, 2023
MAP not working with shape>1 and VI on a pooled model Questions	11	815	November 25, 2017

Simple exponential model fails to find map_estimate

True parameter values

Size of dataset

Predictor variable

Simulate outcome variable

Related topics