Using the shape parameter vs specifying separate variables in a different model altogether

In the tutorial here, under Getting Started: Linear Regression, we have

import pymc3 as pm

# X1, X2, Y are the simulated predictors and outcome defined earlier in the tutorial.
basic_model = pm.Model()

with basic_model:

    # Priors for unknown model parameters
    alpha = pm.Normal('alpha', mu=0, sd=10)
    beta = pm.Normal('beta', mu=0, sd=10, shape=2)
    sigma = pm.HalfNormal('sigma', sd=1)

    # Expected value of outcome
    mu = alpha + beta[0]*X1 + beta[1]*X2

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=mu, sd=sigma, observed=Y)

We need alpha, beta (which has 2 values, beta[0] and beta[1]) and sigma. Now, what is the difference between the above and using something like

beta0 = pm.Normal('beta0', mu=0, sd=10)
beta1 = pm.Normal('beta1', mu=0, sd=10)

mu = alpha + beta0*X1 + beta1*X2

In the first case, the two values of beta also seem to come from completely different distributions, judging from the trace plot. What is the difference between the two approaches?

There should be no differences between these two approaches.
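The equivalence of the deterministic part can be checked outside PyMC3 with plain NumPy (a minimal sketch; the data and coefficient values here are made up for illustration):

```python
import numpy as np

# Hypothetical data, standing in for the tutorial's simulated X1, X2.
rng = np.random.default_rng(42)
X1 = rng.normal(size=100)
X2 = rng.normal(size=100)

alpha = 1.0
beta = np.array([1.0, 2.5])   # vector form, as with shape=2
beta0, beta1 = 1.0, 2.5       # scalar form, as with two separate variables

mu_vector = alpha + beta[0] * X1 + beta[1] * X2
mu_scalar = alpha + beta0 * X1 + beta1 * X2

assert np.allclose(mu_vector, mu_scalar)
```

Since the priors on beta[0]/beta[1] and beta0/beta1 are identical and the likelihood sees the same mu either way, the two models define the same posterior.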


I would like to follow up on this question to clarify something that has confused me. Does this mean that when you specify a distribution as having shape=2, two independent random variables are created with the same prior, which are then sampled independently of each other?

I want to make sure that shape doesn't simply mean "draw two samples from the same distribution". If you wanted to draw multiple samples from the same distribution in a model, what argument would you use? dim=2?

I ask because the documentation says that shape is used to define the length or shape of the random variable, rather than initiating multiple random variables. I think this could be interpreted as a single random variable that is sampled multiple times.

Thanks for your time.


The shape issue is constantly a mess because we do not distinguish between, e.g., event shape and batch shape. There is more discussion elsewhere explaining what the ideal design should be.

As for your specific question, your understanding is correct (without nitpicking some of the terminology). You don't draw samples from a distribution within the model block; that's done via the random method of a RandomVariable or a distribution. Intuitively, think of them as random variables that are constrained by some rules (i.e., the prior distribution), not as random generation from those rules.
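One way to convince yourself that shape=2 means two independent variables with the same prior (a NumPy sketch, not PyMC3 itself): drawing a size-2 normal is exactly the same as stacking two independent scalar draws from the same stream.

```python
import numpy as np

# One size-2 draw from Normal(0, 10)...
rng_a = np.random.default_rng(0)
vector_draw = rng_a.normal(loc=0.0, scale=10.0, size=2)

# ...is identical to two successive independent scalar draws
# from an identically-seeded generator.
rng_b = np.random.default_rng(0)
scalar_draws = np.array([rng_b.normal(0.0, 10.0), rng_b.normal(0.0, 10.0)])

assert np.allclose(vector_draw, scalar_draws)
```

In the model above this is why the trace shows two separate, independently sampled series for beta: the shape=2 prior is a vector of two independent Normal(0, 10) variables, not one variable observed twice.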

Thanks for responding so soon. According to the terminology in the link, am I right then in equating shape, as it currently behaves, with the behavior of param_shape?

Also, is there a distribution argument that would be the equivalent of the atom_shape proposed in the link, which we could use now?

When you say shape is a mess, does it behave differently depending on the use case? Are there cases where shape would result in different behavior than that shown in the model example above?

Nope; currently you need to figure out the full shape (param_shape + atom_shape) and input it correctly into your model. This is what I meant by "mess": there are quite a few edge cases and bugs.
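Roughly, in NumPy terms (a sketch, since param_shape/atom_shape is only a proposed design, and the array values here are invented): the parameters can themselves be arrays (param_shape), and the size/shape argument adds independent replications on top, so the user has to work out the combined shape by hand.

```python
import numpy as np

rng = np.random.default_rng(1)

# param_shape: mu is a length-2 array, so each column gets its own mean.
mu = np.array([0.0, 100.0])

# atom_shape-like replication: ask for 3 independent rows on top of the
# length-2 parameter shape. The final (3, 2) shape must be computed and
# passed explicitly, just as with PyMC3's shape= argument.
draws = rng.normal(loc=mu, scale=1.0, size=(3, 2))

assert draws.shape == (3, 2)
# The second column is centred on its own mean (100), not on 0.
assert draws[:, 1].mean() > 50.0
```

Getting this combined shape wrong (e.g. passing size=(3,) with a length-2 mu) raises a broadcasting error, which is one source of the edge cases mentioned above.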