Hello there, this is my first question here.

Motivated by the capabilities of PyMC3, I tried to use it both to fit a function to some data and to estimate the uncertainties in the fit. Before this I used to work with lmfit (to fit) and emcee (as the MCMC sampler)… I noticed that the method `map_estimate = pm.find_MAP(model=basic_model, method='Powell')` returns the maximum likelihood estimate (i.e. the tuple of parameter values that best fits the data)…

As you can see in the next figure, I have compared the lmfit best fit with the `pm.find_MAP()` likelihood fit… I was wondering why the MLE result is so far from the lmfit result… As a note, I used the same parameter ranges for both methods… In PyMC3 I used uniform priors so that the solutions would be comparable.
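For context, here is a minimal sketch of the kind of setup I am describing (the linear model, the data arrays, the bounds, and `sigma_val` are placeholders, not my actual problem):

```python
import numpy as np
import pymc3 as pm

# placeholder data standing in for my experimental measurements
x = np.linspace(0.0, 10.0, 50)
Experimental_data = 2.0 * x + 1.0 + np.random.normal(0.0, 0.5, size=x.size)
sigma_val = 0.5  # assumed known measurement noise

with pm.Model() as basic_model:
    # uniform priors over the same ranges I gave to lmfit
    a = pm.Uniform('a', lower=-10.0, upper=10.0)
    b = pm.Uniform('b', lower=-10.0, upper=10.0)

    mu_val = a * x + b
    Y_obs = pm.Normal('Y_obs', mu=mu_val, sigma=sigma_val, observed=Experimental_data)

# point estimate via Powell optimization
map_estimate = pm.find_MAP(model=basic_model, method='Powell')
print(map_estimate)
```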

Eventually I thought of an alternative to bypass this issue: when defining the log-likelihood, I can use the lmfit best fit as the observed data… `Y_obs = pm.Normal('Y_obs', mu=mu_val, sigma=sigma_val, observed=lmfit_result)`. My question is: how accurate/robust can this be for estimating the uncertainties in the best-fit values?

Thanks.

Hi, and welcome

It seems to me that you’re turning to a Bayesian model because you care about the uncertainties in your estimates. If so, then my question would be: why aren’t you running MCMC to get full posterior distributions of your parameters, instead of using `find_MAP`, which will only give you point estimates?

Also, note that initializing the MCMC sampler at the MAP is discouraged now – just let PyMC3 initialize the sampler for you.
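Something like this, reusing the placeholder model from above (all names are illustrative):

```python
with basic_model:
    # default initialization and NUTS; no MAP starting point needed
    trace = pm.sample(draws=2000, tune=1000, chains=4)

# every parameter now has a full posterior, not a single point
print(trace['a'].mean(), trace['a'].std())
```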

Hope this helps

Dear @AlexAndorra, thanks for the reply. In my study case I did run PyMC3 as a sampler; I thought that the tuple obtained from `find_MAP()` was computed from the Markov chains (with the samples already drawn)… Apparently it’s a tuple that does not reflect the overall behavior of the drawn samples…

Naively I started doing this:

```python
# define the log-likelihood
Y_obs = pm.Normal('Y_obs', mu=mu_val, sigma=sigma_val, observed=Experimental_data)
```

The MCMC variability did not capture the experimental data very well.

The last trick I tried was the following:

```python
# define the log-likelihood, where lmfit_best_fit holds the
# lmfit fit of Experimental_data evaluated at the data points
Y_obs = pm.Normal('Y_obs', mu=mu_val, sigma=sigma_val, observed=lmfit_best_fit)
```

Using this last approach, the uncertainties capture the experimental data very well…

My question should then be updated to: does PyMC3 run better with pre-processed data (for example, a least-squares approximation of the model) instead of the raw data?

Thanks

I would definitely use the raw data instead of an `lmfit` fit. I think the problem is how you use the results of the MCMC sampler to summarize your inferred parameters: as you said, `find_MAP` does not do that – actually, you won’t find any point estimate that reflects the behavior of the *whole* posterior.

The solution is to do your summary computations directly on the whole posterior. There are infinite possibilities, all bespoke to your particular model, but a classical way to summarize a regression is to plot the mean (e.g. `trace["parameter"].mean(axis=0)`) and the highest posterior density interval of your parameters of interest (you can do that with ArviZ’s `plot_hdi`, for instance).
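For example, with the placeholder linear model sketched earlier (`a`, `b`, and `x` are illustrative names, not your actual model):

```python
import arviz as az
import matplotlib.pyplot as plt

# posterior draws of the regression line, shape (n_samples, len(x))
lines = trace['a'][:, None] * x[None, :] + trace['b'][:, None]

fig, ax = plt.subplots()
ax.scatter(x, Experimental_data, s=10, label='data')
ax.plot(x, lines.mean(axis=0), label='posterior mean')
az.plot_hdi(x, lines, ax=ax)  # 94% highest density interval by default
ax.legend()
plt.show()
```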

Hope this helps

Dear @AlexAndorra,

Thanks for the explanation. From the Markov chains I will obtain the statistical moments of each parameter (as done by `s = pm.summary(trace).round(2)`)…

With the `hdi` output I can get an estimate of the credible intervals for each parameter.
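Concretely, I understand the workflow to be roughly this (a sketch, with `trace` being the output of `pm.sample`):

```python
import arviz as az

# statistical moments and diagnostics for every parameter
s = pm.summary(trace).round(2)
print(s)

# 94% highest density interval for one parameter's pooled samples
print(az.hdi(trace['a']))
```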

Thanks for the support.

Dear @AlexAndorra,

Let me share a new thought. I have read this demo case. After reading it, I tried the following:

1. Run PyMC3 using the `lmfit` fit as observed data; from there I obtain posterior pdfs.
2. Run PyMC3 using the experimental data, but with priors taken from the posteriors obtained in (1), as sketched below.
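In code, the two-step procedure looks roughly like this (a sketch reusing the placeholder linear model and names from earlier; `lmfit_best_fit` and `sigma_val` stand in for my actual quantities):

```python
# Step 1: posteriors from the lmfit curve used as "observed" data
with pm.Model() as step1:
    a = pm.Uniform('a', lower=-10.0, upper=10.0)
    b = pm.Uniform('b', lower=-10.0, upper=10.0)
    pm.Normal('Y_obs', mu=a * x + b, sigma=sigma_val, observed=lmfit_best_fit)
    trace1 = pm.sample(2000, tune=1000)

# Step 2: the step-1 posteriors become informed priors,
# and the raw experimental data is the observed data
with pm.Model() as step2:
    a = pm.Normal('a', mu=trace1['a'].mean(), sigma=trace1['a'].std())
    b = pm.Normal('b', mu=trace1['b'].mean(), sigma=trace1['b'].std())
    pm.Normal('Y_obs', mu=a * x + b, sigma=sigma_val, observed=Experimental_data)
    trace2 = pm.sample(2000, tune=1000)
```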

The overall code behaves very well now. Can this give me the possibility to sample around a good fit and the real data at the same time?

Thanks.