# Pymc4_using: idata_extend()

idata_extend is very handy for getting predicted values from a model, however in the following simple model it fails in pymc4 ?

``````# Hierachicial
with pm.Model() as model_h:
# Define the higher level distribution for the means
mu_mean = pm.Normal("mu_mean", mu=0, sigma=10)
mu_sigma = pm.HalfNormal("mu_sigma", sigma=10)

# Define the lower level prior distribution for the means
mu = pm.Normal("mu", mu=mu_mean, sigma=mu_sigma, shape=4)

# Define the higher level distribution for the standard deviations
sigma_mean = pm.Normal("sigma_mean", mu=0, sigma=10)
sigma_sigma = pm.HalfNormal("sigma_sigma", sigma=10)

# Define the lower level prior distribution for the standard deviations
sigma = pm.HalfNormal("sigma", sigma=sigma_mean, shape=4)

# Define the likelihood distributions for the samples
ν = pm.Exponential('ν', 1/30)

# Define the likelihood for the first group of samples
y1 = pm.StudentT("y1", mu=mu[0], sigma=sigma[0], nu=ν, observed=df['tip'][df['day']=='Thur'].values)

# Define the likelihood for the second group of samples
y2 = pm.StudentT("y2", mu=mu[1], sigma=sigma[1], nu=ν, observed=df['tip'][df['day']=='Fri'].values)

# Define the likelihood for the third group of samples
y3 = pm.StudentT("y3", mu=mu[2], sigma=sigma[2], nu=ν, observed=df['tip'][df['day']=='Sat'].values)

# Define the likelihood for the fourth group of samples
y4 = pm.StudentT("y4", mu=mu[3], sigma=sigma[3], nu=ν, observed=df['tip'][df['day']=='Sun'].values)

# Define the difference of means as a deterministic variable
diff_1_0 = pm.Deterministic('mu2-mu1', mu[1] - mu[0])
diff_2_0 = pm.Deterministic('mu3-mu1', mu[2] - mu[0])
diff_3_0 = pm.Deterministic('mu4-mu1', mu[3] - mu[0])

# Fit the model and draw samples
idata_h= pm.sample(1000,chains=2)
with model_h:
idata_h.extend(pm.sample_prior_predictive())
idata_h.extend(pm.sample_posterior_predictive(idata_h))
``````

Any ideas on why :

``````idata_h.extend(pm.sample_prior_predictive())
idata_h.extend(pm.sample_posterior_predictive(idata_h))
``````

fails? and how to correctly use them with this model to obtain predictives? . Hope you can help.

Hi, could you share the complete error traceback you get? The pymc and arviz versions you are using might also be helpful.

You can use three backticks to format code and backticks

`````````
Full traceback here as text
```
``````

We also have other useful tricks for maximising the chances we’ll be able to answer your question at Asking Questions on the PyMC Discourse — PyMC project website

The PYMC and Arviz versions I am running are:

• Runing on PyMC v4.4.0
• Runing on Arviz v0.14.0

The full error Message is:

``````ValueError                                Traceback (most recent call last)
413                 ):
--> 414                     thunk()
415                     for old_s in old_storage:

~\anaconda_new\lib\site-packages\aesara\graph\op.py in rval(p, i, o, n, params)
542             ):
--> 543                 r = p(n, [x[0] for x in i], o)
544                 for o in node.outputs:

~\anaconda_new\lib\site-packages\aesara\tensor\random\op.py in perform(self, node, inputs, outputs)
367
--> 368         smpl_val = self.rng_fn(rng, *(args + [size]))
369

~\anaconda_new\lib\site-packages\aesara\tensor\random\basic.py in rng_fn(cls, *args, **kwargs)
54         size = args[-1]
---> 55         res = cls.rng_fn_scipy(*args, **kwargs)
56

~\anaconda_new\lib\site-packages\aesara\tensor\random\basic.py in rng_fn_scipy(cls, rng, loc, scale, size)
336         """
--> 337         return stats.halfnorm.rvs(loc, scale, random_state=rng, size=size)
338

~\anaconda_new\lib\site-packages\scipy\stats\_distn_infrastructure.py in rvs(self, *args, **kwds)
1065                        "documentation for details.")
-> 1066             raise ValueError(message)
1067

ValueError: Domain error in arguments. The `scale` parameter must be positive for all distributions, and many distributions have restrictions on shape parameters. Please see the `scipy.stats.halfnorm` documentation for details.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_30492\2263873921.py in <module>
1 with model_h:
----> 2     idata_h.extend(pm.sample_prior_predictive())
3     idata_h.extend(pm.sample_posterior_predictive(idata_h))

~\anaconda_new\lib\site-packages\pymc\sampling\forward.py in sample_prior_predictive(samples, model, var_names, random_seed, return_inferencedata, idata_kwargs, compile_kwargs)
422     # All model variables have a name, but mypy does not know this
423     _log.info(f"Sampling: {list(sorted(volatile_basic_rvs, key=lambda var: var.name))}")  # type: ignore
--> 424     values = zip(*(sampler_fn() for i in range(samples)))
425
426     data = {k: np.stack(v) for k, v in zip(names, values)}

~\anaconda_new\lib\site-packages\pymc\sampling\forward.py in <genexpr>(.0)
422     # All model variables have a name, but mypy does not know this
423     _log.info(f"Sampling: {list(sorted(volatile_basic_rvs, key=lambda var: var.name))}")  # type: ignore
--> 424     values = zip(*(sampler_fn() for i in range(samples)))
425
426     data = {k: np.stack(v) for k, v in zip(names, values)}

~\anaconda_new\lib\site-packages\aesara\compile\function\types.py in __call__(self, *args, **kwargs)
969         try:
970             outputs = (
--> 971                 self.vm()
972                 if output_subset is None
973                 else self.vm(output_subset=output_subset)

416                         old_s[0] = None
417             except Exception:
--> 418                 raise_with_op(self.fgraph, node, thunk)
419

~\anaconda_new\lib\site-packages\aesara\link\utils.py in raise_with_op(fgraph, node, thunk, exc_info, storage_map)
532         # Some exception need extra parameter in inputs. So forget the
533         # extra long error message in that case.
--> 534     raise exc_value.with_traceback(exc_trace)
535
536

412                     self.thunks, self.nodes, self.post_thunk_clear, fillvalue=()
413                 ):
--> 414                     thunk()
415                     for old_s in old_storage:
416                         old_s[0] = None

~\anaconda_new\lib\site-packages\aesara\graph\op.py in rval(p, i, o, n, params)
541                 p=p, i=node_input_storage, o=node_output_storage, n=node, params=None
542             ):
--> 543                 r = p(n, [x[0] for x in i], o)
544                 for o in node.outputs:
545                     compute_map[o][0] = True

~\anaconda_new\lib\site-packages\aesara\tensor\random\op.py in perform(self, node, inputs, outputs)
366         rng_var_out[0] = rng
367
--> 368         smpl_val = self.rng_fn(rng, *(args + [size]))
369
370         if (

~\anaconda_new\lib\site-packages\aesara\tensor\random\basic.py in rng_fn(cls, *args, **kwargs)
53     def rng_fn(cls, *args, **kwargs):
54         size = args[-1]
---> 55         res = cls.rng_fn_scipy(*args, **kwargs)
56
57         if np.ndim(res) == 0:

~\anaconda_new\lib\site-packages\aesara\tensor\random\basic.py in rng_fn_scipy(cls, rng, loc, scale, size)
335
336         """
--> 337         return stats.halfnorm.rvs(loc, scale, random_state=rng, size=size)
338
339

~\anaconda_new\lib\site-packages\scipy\stats\_distn_infrastructure.py in rvs(self, *args, **kwds)
1064                        f"Please see the `scipy.stats.{self.name}` "
1065                        "documentation for details.")
-> 1066             raise ValueError(message)
1067
1068         if np.all(scale == 0):

ValueError: Domain error in arguments. The `scale` parameter must be positive for all distributions, and many distributions have restrictions on shape parameters. Please see the `scipy.stats.halfnorm` documentation for details.
Apply node that caused the error: halfnormal_rv{0, (0, 0), floatX, True}(RandomGeneratorSharedVariable(<Generator(PCG64) at 0x23FA90F5900>), TensorConstant{(1,) of 4}, TensorConstant{11}, TensorConstant{0.0}, sigma_mean)
Toposort index: 11
Inputs types: [RandomGeneratorType, TensorType(int64, (1,)), TensorType(int64, ()), TensorType(float32, ()), TensorType(float64, ())]
Inputs shapes: ['No shapes', (1,), (), (), ()]
Inputs strides: ['No strides', (8,), (), (), ()]
Inputs values: [Generator(PCG64) at 0x23FA90F5900, array([4], dtype=int64), array(11, dtype=int64), array(0., dtype=float32), array(-14.39499803)]
Outputs clients: [['output'], ['output', Subtensor{uint8}(sigma, ScalarConstant{3}), Subtensor{uint8}(sigma, ScalarConstant{2}), Subtensor{uint8}(sigma, ScalarConstant{1}), Subtensor{uint8}(sigma, ScalarConstant{0})]]

HINT: Re-running with most Aesara optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the Aesara flag 'optimizer=fast_compile'. If that does not work, Aesara optimizations can be disabled with 'optimizer=None'.
HINT: Use the Aesara flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
``````

Hope, that the above helps to the resolve the problem. Thank you for devoting you time and expertise to resolving this newbie coding problem. Declan

The error is not happening with `idata.extend` but in `pm.sample_prior_predictive`. You can see this in these lines of the traceback:

``````ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_30492\2263873921.py in
1 with model_h:
----> 2     idata_h.extend(pm.sample_prior_predictive())
3     idata_h.extend(pm.sample_posterior_predictive(idata_h))

~\anaconda_new\lib\site-packages\pymc\sampling\forward.py in sample_prior_predictive(samples, model, var_names, random_seed, return_inferencedata, idata_kwargs, compile_kwargs)
422     # All model variables have a name, but mypy does not know this
423     _log.info(f"Sampling: {list(sorted(volatile_basic_rvs, key=lambda var: var.name))}")  # type: ignore
--> 424     values = zip(*(sampler_fn() for i in range(samples)))
425
426     data = {k: np.stack(v) for k, v in zip(names, values)}``````

The reason for the error is that the HalfNormal distribution is getting unvalid parameters:

``````ValueError: Domain error in arguments. The `scale` parameter must be positive for all distributions, and many distributions have restrictions on shape parameters. Please see the `scipy.stats.halfnorm` documentation for details.
Apply node that caused the error: halfnormal_rv{0, (0, 0), floatX, True}(RandomGeneratorSharedVariable(), TensorConstant{(1,) of 4}, TensorConstant{11}, TensorConstant{0.0}, sigma_mean)``````

Seeing that I reviewed the model and `sigma_mean` use (scroll all the way to the right of the 2nd traceback exerpt) and saw

You are using `sigma_mean` which can be positive as the `sigma` parameter of the `HalfNormal("sigma"` distribution, so the model gets negative values to be used as sigma which is not possible and therefore errors out. Moreover, the `sigma_sigma` variable is not being used currently. You should update so that `sigma_sigma` is used as `sigma` in the HalfNormal distribution and remove the `sigma_mean` variable as it is not needed, `mu` and `sigma` have different distributions as priors, so they need different hyper parameters too

Guided by your comments, I have reconstructed the Hyperpriors and priors as follows and It now appears to work and also the new priors I hope make sense. Many thanks again for your help. the new hyperpriors for the hierachial are:

# hyper_priors

``````μ_mu = pm.Normal('μ_mu', mu=0, sigma=10)
μ_sd = pm.HalfNormal('μ_sd', 10)

# priors
mu = pm.Normal('μ', mu=μ_mu, sigma=μ_sd, shape=4)
sigma = pm.HalfNormal('σ', sigma=10, shape=4)
``````