# Is there any material that gently explains how PyMC3 classes/functions work?

Dear pymc3ers:

I am new to Bayesian statistics and PyMC3. I have followed tutorials to build models, but I want to understand more about what pm.sample() does. Let's use a simple example. The following model is from the official tutorial:

```python
import numpy as np
import pymc3 as pm

# Simulated data (the official tutorial generates X1, X2, and Y similarly)
np.random.seed(123)
X1 = np.random.randn(100)
X2 = np.random.randn(100) * 0.2
Y = 1 + 1 * X1 + 2.5 * X2 + np.random.randn(100)

basic_model = pm.Model()

with basic_model:
    # Priors for unknown model parameters
    alpha = pm.Normal('alpha', mu=0, sigma=10)
    beta = pm.Normal('beta', mu=0, sigma=10, shape=2)
    sigma = pm.HalfNormal('sigma', sigma=1)

    # Expected value of outcome
    mu = alpha + beta[0] * X1 + beta[1] * X2

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=mu, sigma=sigma, observed=Y)

with basic_model:
    # draw 500 posterior samples
    trace = pm.sample(500)
```

Is there an explanation of how pm.sample(500) works, step by step, to actually sample from the posterior distribution defined by basic_model? When and how are the priors and likelihood used during sampling? What is the difference between pm.Normal() with the observed=Y argument and without it? How and when should each be used?

Thank you very much.

Many of your questions relate to concepts in Bayesian statistics, so I suggest you follow a book and get some grounding in them first. Statistical Rethinking is a good one to start with. We also have a lot of resources at https://docs.pymc.io/learn.html.

As for the code logic, you can have a look at the developer guide, https://docs.pymc.io/developer_guide.html (but I suggest getting comfortable with the Bayesian concepts first).

> Is there an explanation of how pm.sample(500) works, step by step, to actually sample from the posterior distribution defined by basic_model?

See https://docs.pymc.io/developer_guide.html#logp-and-dlogp. Basically, `pymc3.Model` collects everything defined in the `with` context and generates a logp_dlogp function, and then a sampler uses that logp_dlogp function to do the sampling.
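To make the "a sampler uses that logp function" step concrete, here is a toy random-walk Metropolis sampler on a one-parameter model, written from scratch with scipy. This is only a sketch: PyMC3's default NUTS sampler is far more sophisticated (it also uses gradients), but the loop structure — propose, evaluate logp, accept or reject — is the same idea.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Toy data: 200 draws from N(3, 1); we will infer the mean mu
data = rng.normal(loc=3.0, scale=1.0, size=200)

def logp(mu):
    # log prior N(0, 10) plus log likelihood N(mu, 1) summed over the data
    return stats.norm.logpdf(mu, loc=0, scale=10) + stats.norm.logpdf(data, loc=mu, scale=1).sum()

def metropolis(logp, start, draws, step=0.2):
    samples = []
    current, current_lp = start, logp(start)
    for _ in range(draws):
        proposal = current + rng.normal(scale=step)   # propose a nearby value
        proposal_lp = logp(proposal)
        # accept with probability min(1, exp(proposal_lp - current_lp))
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples.append(current)                        # keep current value either way
    return np.array(samples)

trace = metropolis(logp, start=0.0, draws=2000)
```

After discarding some burn-in draws, the samples concentrate around the data mean, which is what "sampling the posterior" delivers.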

> When and how are the priors and likelihood used during sampling?

They are combined into a single model log-probability function; see my talk at PyData Berlin: https://github.com/junpenglao/All-that-likelihood-with-PyMC3

> What is the difference between pm.Normal() with the observed=Y argument and without it? How and when should each be used?

Without `observed=Y`, it is a free parameter whose posterior distribution you want to infer. With `observed=Y`, you associate your data (i.e., observations) with a distribution, which makes it the likelihood. In principle, you use `observed` whenever you want to tie actual data to some distribution.
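The distinction can be sketched in plain scipy (an illustration, not PyMC3 internals): a free variable is an argument the sampler proposes values for and scores against its prior, while an observed variable has its values pinned to the data, so only the distribution's parameters vary.

```python
import numpy as np
from scipy import stats

# Toy data standing in for the `observed=` values
data = np.array([1.2, 0.8, 1.5, 0.9, 1.1])

# Free variable: theta is unknown; the sampler proposes values for it
# and each proposal is scored against the prior log-density.
def prior_logp(theta):
    return stats.norm.logpdf(theta, loc=0, scale=10)

# Observed variable: the data are fixed, never sampled; theta only shifts
# the distribution the fixed data are evaluated under.
def likelihood_logp(theta):
    return stats.norm.logpdf(data, loc=theta, scale=1).sum()
```

Note the asymmetry: `prior_logp` evaluates the density *at* the proposed theta, while `likelihood_logp` evaluates the density *of* the fixed data given theta.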


Thanks for your response. I will read the book and the code and get back to you shortly.