In the tutorial at http://docs.pymc.io/notebooks/getting_started, under the Linear Regression section of Getting Started, we have
import pymc3 as pm

basic_model = pm.Model()

with basic_model:
    # Priors for unknown model parameters
    alpha = pm.Normal('alpha', mu=0, sd=10)
    beta = pm.Normal('beta', mu=0, sd=10, shape=2)
    sigma = pm.HalfNormal('sigma', sd=1)

    # Expected value of outcome (X1, X2 and Y are the data arrays from the tutorial)
    mu = alpha + beta[0]*X1 + beta[1]*X2

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=mu, sd=sigma, observed=Y)
We need alpha, beta (which has 2 values, b_0 and b_1) and sigma. Now, what is the difference between the above and using something like
beta0 = pm.Normal('beta0', mu=0, sd=10)
beta1 = pm.Normal('beta1', mu=0, sd=10)
mu = alpha + beta0*X1 + beta1*X2
In the first case, the two values of beta also seem to come from completely different distributions, judging from the trace plot. What is the difference between the two approaches?
There should be no difference between these two approaches.
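A quick way to convince yourself of this, sketched in plain NumPy rather than PyMC3 (which works here because the priors are independent Normals, so `shape=2` just describes two i.i.d. components):

```python
import numpy as np

rng = np.random.RandomState(42)

# Approach 1: one vector-valued prior with two components,
# analogous to pm.Normal('beta', mu=0, sd=10, shape=2)
draws_vector = rng.normal(loc=0, scale=10, size=(100_000, 2))

# Approach 2: two separate scalar priors,
# analogous to defining beta0 and beta1 individually
rng2 = np.random.RandomState(7)
beta0 = rng2.normal(loc=0, scale=10, size=100_000)
beta1 = rng2.normal(loc=0, scale=10, size=100_000)

# Both parameterizations describe two i.i.d. Normal(0, 10) variables:
# the marginal statistics agree up to Monte Carlo error.
print(draws_vector.mean(axis=0))   # both entries near 0
print(draws_vector.std(axis=0))    # both entries near 10
print(beta0.std(), beta1.std())    # also near 10
```

The trace plots differ between runs only because of sampling noise, not because the priors differ.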
I would like to follow up on this question to clarify something that has confused me. Does this mean that when you specify a distribution as having shape=2, two independent random variables are created with the same prior, which are then inferred independently of each other?
I want to make sure that shape doesn't simply mean "draw two samples from the same distribution". If you wanted to draw multiple samples from the same distribution in a model, what argument would you use? dim=2?
I ask because the documentation says that shape is used to define the length or shape of the random variable, rather than initiating multiple random variables. I think this could be interpreted as a single random variable which is sampled multiple times.
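The distinction in question can be sketched in NumPy (PyMC3's shape argument corresponds to the first behavior below: independent components sharing the same prior, not one value copied):

```python
import numpy as np

rng = np.random.RandomState(0)

# shape=2 behaves like one draw of a length-2 vector whose
# components are independent: two distinct values per draw.
vector_draw = rng.normal(loc=0, scale=10, size=2)

# "a single random variable sampled once and then repeated"
# would instead look like this -- NOT what shape=2 does:
repeated_draw = np.full(2, rng.normal(loc=0, scale=10))

print(vector_draw)    # two different numbers
print(repeated_draw)  # the same number twice
```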
Thanks for your time.
The shape issue is constantly a mess because we do not distinguish between, e.g., event shape and batch shape. There is more in https://github.com/pymc-devs/pymc3/pull/2833 explaining what the ideal design should be.
As for your specific question, your understanding is correct (without nitpicking some of the terminology). You don't draw samples from a distribution within the model block; that's done via the random method of a RandomVariable or a distribution. Intuitively, think of them as random variables that are constrained by some rules (i.e., the prior distribution), not as random generation from those rules.
Thanks for responding so soon. According to the terminology in the link, am I right then in equating shape, as it currently behaves, with the behavior of
Also, is there a distribution argument which would be the equivalent of the atom_shape proposed in the link that we could use now?
When you say shape is a mess, does it perform differently depending on the use case? Are there cases where shape would result in different behavior than that shown in the model example above?
Nope; currently you need to figure out the full shape (param_shape + atom_shape) and input it correctly into your model. This is what I meant by "mess": there are quite a few edge cases and bugs.
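To illustrate the "figure out the full shape yourself" point, here is a NumPy sketch. Note that param_shape and atom_shape are terms from the linked PR's proposed design, not actual PyMC3 arguments: if a distribution's parameter mu has shape (3,) (the param_shape) and you want 4 independent replicates (the atom_shape), today you concatenate the two yourself and pass the combined shape:

```python
import numpy as np

rng = np.random.RandomState(1)

# param_shape: the distribution's parameters are vector-valued
mu = np.array([0.0, 5.0, 10.0])      # shape (3,)

# atom_shape: we want 4 independent replicates of that vector RV
atom_shape = (4,)

# Today the user concatenates the shapes manually, e.g.
# pm.Normal('x', mu=mu, sd=1, shape=(4, 3)) in a model:
full_shape = atom_shape + mu.shape   # (4, 3)

# mu of shape (3,) broadcasts against the trailing axis of (4, 3)
draws = rng.normal(loc=mu, scale=1.0, size=full_shape)
print(draws.shape)  # (4, 3)
```

Getting this concatenation wrong (or hitting a parameter that doesn't broadcast the way you expect) is where the edge cases and bugs tend to show up.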