Hello PyMC3 community,
First, thanks for all of the support. I’ve learned a lot about bayesian modeling from reading through these posts and pymc3 docs and examples.
I’ll try to describe my problem as briefly as possible.
I’m modeling a set of differential equations that time-dependent forcing conditions and several (~5-10) time-invariant model parameters. I want to infer the likelihood of the parameter values by comparing the model solution to real-world data. I assume that the errors between the model solutions and the data are gaussian and independent for each timestep.
The forward model is written in fortran imported as a module with f2py. Therefore I’m using the pm.Potential class (following examples from the “black box” model approach) with a least squares likelihood function.
I have 7 model parameters in addition to the error variance term that I attempt to infer. I’ve experimented using both uninformative, uniform prior distributions and normal distributions that reflect reasonable ranges of the parameter values. I’m using the metropolis sampler with 14 chains.
I’m finding that one of the parameters, “pm” seems to get “stuck” towards the lower range of whatever uniform prior that I assign. If I manually adjust the parameter to a higher value I find that the model will have a better rmse.
Here is the sample traceplot:
And here are two plots to illustrate further.
-
The first shows the model solutions (from the traceplot shown above) for the posterior trace compared with the observed data (red). The “pm” parameter has a prior uniform range between .4 and 1.6.
-
Same as the first, but from another range where the prior on pm is set between .8 and 1.2:
You can see that #2 does a much better job matching the data. The fact that the posterior is so close to the lower limit of the prior seems odd to me. It seems like sampler should be able to “find” this region of better fit.
Does anyone have an idea why I might see this behavior? Are there any model adjustments that I would try making? Any pointers of any kind are appreciated!
It’s worth mentioning that the the ‘number of effective’ samples is also low (<10-20%) for some of these experiments.
And this is the code and hyperparameters (in yml format) that I’m using. I can post a reproducible example but you would need to compile a fortran module…
def sls_like(func, parameters, data):
# run the forward model with the new parameters
model = func(parameters[:-1])['qchan']
sigma = parameters[-1]
warm_up = 0
model = model[warm_up:]
data = data[warm_up:]
return -np.sum(0.5 * np.log(2 * np.pi * sigma ** 2) + ((data - model) / sigma) ** 2)
class likelihood_evaluate(tt.Op):
itypes = [tt.dvector] # expects a vector of parameter values when called
otypes = [tt.dscalar] # outputs a single scalar value (the log likelihood)
def __init__(self, func, loglike, obs_data):
# add inputs as class attributes
self.func = func
self.likelihood = loglike
self.obs_data = obs_data
def perform(self, node, inputs, outputs):
# the method that is used when calling the Op
parameters, = inputs # this will contain my variables
# call the log-likelihood function
logl = self.likelihood(self.func, parameters, self.obs_data)
outputs[0][0] = np.array(logl)
def main(func,
dict_of_parameters,
obs_q,
tracename="trace.pkl",
priorname="prior.pkl"):
# create the logliklihood function
logl = likelihood_evaluate(func, sls_like, obs_q)
#logl = likelihood_evaluate(func, wls_like, obs_q)
with pm.Model() as model:
# This list collects the model paramters that are passed in by "dict of parameters"
# this is just a convenience
list_of_parameters = []
for key in dict_of_parameters.keys():
p0 = dict_of_parameters.get(key)
if p0[0] == "normal":
p1 = pm.Normal(key, mu=p0[1], sigma=p0[2])
if p0[0] == "uniform":
p1 = pm.Uniform(key, lower=p0[1], upper=p0[2])
# if p0[0] == "HalfNormal":
# p1 = pm.Uniform(key, lower=p0[1], upper=p0[2])
# append it to the list ..
list_of_parameters.append(p1)
# Append the "sigma" parameter to the end of the list. this is the sigma
# in the error model
list_of_parameters.append(pm.Uniform("sigma", lower=.1, upper=1.0))
parameters = tt.as_tensor_variable(list_of_parameters)
pmpot = pm.Potential('loglike', logl(parameters))
prior = pm.sample_prior_predictive(samples=7000)
with model:
step1 = pm.Metropolis(scaling=.5)
trace = pm.sample(step = step1, chains=14, tune=500, discard_tuned_samples=True)
posterior_predictive = pm.sample_posterior_predictive(trace, var_names=list(dict_of_parameters.keys()))
with open(tracename, 'wb') as buff:
pickle.dump(trace, buff)
with open('posterior_predictive.pkl', 'wb') as buff:
pickle.dump(posterior_predictive, buff)
with open(priorname, 'wb') as buff:
pickle.dump(prior, buff)
pm.traceplot(trace)
plt.show()
print(pm.summary(trace))
c:
- normal
- 1.0
- .1
ks: - uniform
- 1
- 20.0
ku: - uniform
- 1.0
- 20.0
lowercasen: - normal
- 1.0
- .1
sm1max: - normal
- 700
- 25.0
sm2max: - normal
- 300
- 25.0
pm: - uniform
- .4
- 1.6