From what I understand, this setup is quite typical for a hierarchical model:

```
with pm.Model() as model:
global_param_mu = pm.Normal(‘global_param_mu, 0.0, 1.0)
global_param_sigma = pm.HalfNormal(‘global_param_sigma’, 1.0)
local_param = pm.Normal(‘local_param’, global_param_mu, global_param_sigma)
mu = pm.Deterministic(‘mu’, x*local_param[local_idx])
sigma = pm.HalfNormal(‘sigma’, 1.0)
likelihood = pm.Normal(‘likelihood’, mu, sigma, observed = y)
```

And I’m wondering about `global_param_sigma`

… I discovered that when I drop it and just enter `1.0`

as the value for `local_param`

sigma, my model samples much faster and give the same good result.

Since my data is standardized, I expect everything to be distributed normally with 0.0 mean and 1.0 stdev. So I thought this would be ok.

But still, I would like to here someone else’s thoughts of why or why not to sample `global_param_sigma`

I’m not expert, but:

There’s an issue with your model in that sizes or dimensions are not specified. `local_param`

seems to be indexed by `local_idx`

, but it’s not at all clear how many of them there are.

If your data is standardized, that means that the overall standard deviation (of the observed data `y`

) should be 1.

This standard deviation is calculated on the data and the model says it arises from three different sources: the variance of `y`

around `mu`

(here called `sigma`

), the variability in `mu`

caused by `x`

, and the vaiability in the modulation of the effect of `x`

that comes from variability in `local_param`

(called `global_param_sigma`

).

Together these different sources of variability in `y`

have to sum to be 1, but since variability is positive, they cannot each themselves be 1. In fact, what you are doing by having the two parameters, `sigma`

and `global_param_sigma`

is partitioning the variance of `y`

into three variances as described above. Properly normalized, they will sum to the total variance which is 1.

Note also, that they will not sum to exactly 1. It is only the variance of the standardized data that is 1. The variance of the population is a parameter with a distribution. This distribution can be found by addng up the (properly normalized) (samples of the) components of the variance specified in the model as I discussed above.

I hope this helps. I worry I was not clear.

Opher

1 Like