Question of sampling result

Jym0502 · September 13, 2020, 10:44am

I am new to learning MCMC and I have a nonlinear model like this:

Takano_model_1 = pm.Model()
with Takano_model_1:
## Priors
A_f = pm.Bound(pm.Normal, lower=0.0)(“A_f”,mu = 3.8671, sigma = 3.867110)
A_r = pm.Bound(pm.Normal, lower=0.0)(“A_r”,mu = 1.1635, sigma = 1.163510)
A_CO2 = pm.Bound(pm.Normal, lower=0.0)(“A_CO2”,mu = 2.5027, sigma = 2.502710)
A_H2O = pm.Bound(pm.Normal, lower=0.0)(“A_H2O”,mu = 5.5128, sigma = 5.512810)
### Activation energy (unit:J/mol)
Ea_f = pm.Bound(pm.Normal, lower=0.0)(“Ea_f”,mu = 2.2768, sigma = 2.276810)
Ea_r = pm.Bound(pm.Normal, lower=0.0)(“Ea_r”,mu = 1.1445, sigma = 1.144510)
Ea_CO2 = pm.Bound(pm.Normal, upper=0.0)(“Ea_CO2”,mu = -3.2333, sigma = 3.233310)
Ea_H2O = pm.Bound(pm.Normal, lower=0.0)(“Ea_H2O”,mu = 7.7608, sigma = 7.760810)
sigma = pm.HalfNormal(‘sigma’,sd = 0.0003)
k_f = A_f * 1.0E02 * (math.e ** (- Ea_f * 1.0E04 / R / T_Bottom))
k_r = A_r * 1.0E08 * (math.e ** (- Ea_r * 1.0E05 / R / T_Bottom))
K_CO2 = A_CO2 * 1.0E-05 * (math.e ** (- Ea_CO2 * 1.0E04 / R / T_Bottom))
K_H2O = A_H2O * 1.0E07 * (math.e ** (- Ea_H2O * 1.0E04 / R / T_Bottom))
r_CH4_obs = k_f * (K_CO2 * p_CO2_in * p_H2_in ** 0.5) / ((1 + K_CO2 * p_CO2_in) ** 2) - k_r * (K_H2O * p_CH4_in ** 2 * p_H2O_in) / ((1 + K_H2O ** 2 * p_H2O_in) ** 2)

##Likehood function
r_CH4_likehood = pm.Normal('r_CH4_likehood',mu = r_CH4_obs, sd = sigma, observed = r0_CH4)

Due to the conditions of the chemical equation, bound is applied to the prior distribution.
But I’m confused about the posterior traceplot:

It seems that the resulting distribution around 0 is odd. The posterior should have been a Normal Distribution, but A_CO2 and A_H2O look like Halfnormal distribution. And it seems that the distribution of Ea_r is started from 0, many sampling results distributed at 0, it looks like that the samlping less than zero is cut.
So my question is whether the function of bound is to directly remove the result below 0?
How can I explain the result of Ea_r, which is odd around zero?
And Is the sampling result of this MCMC reliable?

Thanks!

ar-valdez · September 18, 2020, 12:45am

So let’s go by parts MCMC and PyMC3 is a smart sampler that uses your parameter priors to build Markov chains that contain your posterior distributions for each parameter.

This being said, you can always update your priors for example for Ea_r you can use in a second step something like Uniform(3,7) instead of your initial prior… Why I am saying this, for you to force the sampler to draw samples in a reduced region.

In order to check if the mcmc samples have converged you can always plot the autocorrelation (good case is near 0) something like this:

pm.plot_autocorr(trace)

Hope my advice helps you…

cluhmann · September 18, 2020, 3:34am

My understanding of how pymc3 constructs bounded variables is that it simply prevents the sampler/chain from ever exploring outside the bounding interval. In your example, you have placed lower bounds of 0.0 on both A_CO2 and A_H2O. However, it’s pretty clear that the model and the data are encouraging the sampler to head toward zero (the marginal posteriors for both of these variables are stacked up right at 0.0). Once the sampler gets there, it bumps up against the bound you have provided and cannot explore smaller values. The conflict between the model/data/likelihood on the one hand and the bound on the other hand is likely what is causing the divergences (the tick marks on the x-axis) you see near A_CO2=0 and A_H2O=0.

So to answer a couple of your questions directly:

…bound is applied to the prior distribution.

Bounds are applied to variables, not just the priors associated with those variables.

…is the function of bound is to directly remove the result below 0?

No. The function is the prevent the sampler from every exploring values that are less than zero. The chains don’t sample negative values that later get “removed”. Negative values are never sampled in the first place.

How can I explain the result of Ea_r, which is odd around zero?

Given the difficulties associated with A_CO2 and A_H2O, the entire posterior interpretation of the entire posterior is probably a bit suspect. I would focus dealing with A_CO2 and A_H2O, their bounds, and the divergences they seem to be producing before worrying about the other variables.

Topic		Replies	Views
Bound variables misbehaving Questions	1	486	December 10, 2020
Prior boundaries that are not constant but random variables Questions	6	877	December 14, 2017
Truncated/Bounded Multivariate Normal? Questions	6	2717	March 13, 2019
PyMC3-Sampling not taking place for the entire range of values between the given bounds of a Uniform prior Questions	7	1588	January 14, 2018
The Normal distribution with upper and lower bound Questions	2	1782	December 14, 2017

Question of sampling result

Related topics