How to fix "Domain error" when using posterior predictive sampling?

I can generate a normal-looking traceplot and posterior plot from my model, but when I run the following:

ppc = pm.sample_ppc(trace, samples=500, model=model)

I get the following error:


ValueError                                Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\pymc3\distributions\distribution.py in _draw_value(param, point, givens, size)
    397                 try:
--> 398                     dist_tmp.random(point=point, size=size)
    399                 except (ValueError, TypeError):

C:\ProgramData\Anaconda3\lib\site-packages\pymc3\distributions\continuous.py in random(self, point, size)
   1074                                 dist_shape=self.shape,
-> 1075                                 size=size)
   1076 

C:\ProgramData\Anaconda3\lib\site-packages\pymc3\distributions\distribution.py in generate_samples(generator, *args, **kwargs)
    502         if size == 1 or (broadcast_shape == size_tup + dist_shape):
--> 503             samples = generator(size=broadcast_shape, *args, **kwargs)
    504         elif dist_shape == broadcast_shape:

C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py in rvs(self, *args, **kwds)
    939         if not np.all(cond):
--> 940             raise ValueError("Domain error in arguments.")
    941 

ValueError: Domain error in arguments.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-45-10a08c3d7ad4> in <module>()
----> 1 ppc = pm.sample_ppc(trace, samples=500, model=model)

C:\ProgramData\Anaconda3\lib\site-packages\pymc3\sampling.py in sample_ppc(trace, samples, model, vars, size, random_seed, progressbar)
   1127     # draw once to inspect the shape
   1128     var_values = list(zip(varnames,
-> 1129                           draw_values(vars, point=model.test_point, size=size)))
   1130     ppc_trace = defaultdict(list)
   1131     for varname, value in var_values:

C:\ProgramData\Anaconda3\lib\site-packages\pymc3\distributions\distribution.py in draw_values(params, point, size)
    319             else:
    320                 try:  # might evaluate in a bad order,
--> 321                     evaluated[param_idx] = _draw_value(param, point=point, givens=givens.values(), size=size)
    322                     if isinstance(param, collections.Hashable) and named_nodes_parents.get(param):
    323                         givens[param.name] = (param, evaluated[param_idx])

C:\ProgramData\Anaconda3\lib\site-packages\pymc3\distributions\distribution.py in _draw_value(param, point, givens, size)
    401                     # with theano.shared inputs
    402                     dist_tmp.shape = np.array([])
--> 403                     val = dist_tmp.random(point=point, size=None)
    404                     dist_tmp.shape = val.shape
    405                 return dist_tmp.random(point=point, size=size)

C:\ProgramData\Anaconda3\lib\site-packages\pymc3\distributions\continuous.py in random(self, point, size)
   1073         return generate_samples(stats.beta.rvs, alpha, beta,
   1074                                 dist_shape=self.shape,
-> 1075                                 size=size)
   1076 
   1077     def logp(self, value):

C:\ProgramData\Anaconda3\lib\site-packages\pymc3\distributions\distribution.py in generate_samples(generator, *args, **kwargs)
    501     elif broadcast_shape[-len(dist_shape):] == dist_shape or len(dist_shape) == 0:
    502         if size == 1 or (broadcast_shape == size_tup + dist_shape):
--> 503             samples = generator(size=broadcast_shape, *args, **kwargs)
    504         elif dist_shape == broadcast_shape:
    505             samples = generator(size=size_tup + dist_shape, *args, **kwargs)

C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py in rvs(self, *args, **kwds)
    938         cond = logical_and(self._argcheck(*args), (scale >= 0))
    939         if not np.all(cond):
--> 940             raise ValueError("Domain error in arguments.")
    941 
    942         if np.all(scale == 0):

ValueError: Domain error in arguments.

I am very new to using PyMC3, so this is probably some simple mistake, but I haven’t been able to find relevant information online.

My model is the following (an attempt at Kruschke’s multiple linear regression with a Student’s t hyperprior, for a beta-distributed y variable):
[image: printout of the model]

How do I fix this? Please let me know what other information would be relevant!


While the printout of the model is helpful, it would also be nice to see the Python code. Could you add that here as well?

Also, at first glance, my suspicion is that nothing constrains the alpha/beta arguments passed to the beta-distributed response to be positive, which the beta distribution requires of its parameters.

For example, if the sample_ppc procedure draws -1 for beta0 and 0 for beta1, then the resulting input to the beta distribution will be -1, which is invalid, and hence the error is triggered.
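To make that concrete, here is a minimal sketch that calls scipy directly, outside of PyMC3. The traceback shows the error being raised from `stats.beta.rvs`, so the same check can be reproduced on its own. The helper name `try_beta_draw` and the parameter values are made up for illustration and are not taken from the model in question.

```python
from scipy import stats


def try_beta_draw(alpha, beta):
    """Return one draw from Beta(alpha, beta), or the error text if scipy rejects the arguments."""
    try:
        return stats.beta.rvs(alpha, beta, size=1, random_state=0)
    except ValueError as err:
        return str(err)


valid = try_beta_draw(2.0, 3.0)      # both parameters positive: returns an array with one draw
invalid = try_beta_draw(-1.0, 3.0)   # negative alpha: scipy raises the "Domain error" ValueError
zero = try_beta_draw(0.0, 3.0)       # alpha of exactly zero is rejected as well

print(valid)
print(invalid)
print(zero)
```

If the second and third calls print a message starting with "Domain error in arguments", then any draw that passes a non-positive alpha or beta to the beta distribution would fail in the same way.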


Thank you for your reply! I’ve stopped getting the error now after making other changes to my model, and I didn’t save a copy of the buggy code. Should I close this question or something similar?

For what it’s worth, I had already clipped mu to be between 0 and 1, and as log(kappa) will always be positive (drawn from an exponential distribution), alpha and beta were also guaranteed to be positive. Otherwise, I believe pm.sample would not have worked either?

There’s no need to close it, though if you do discover what the issue was, it would be good if you added that information here.

Note that zero is also a forbidden value for the beta distribution parameters. If the range you clipped to included zero, then that could have caused your error.
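As a rough sketch of one way to avoid the boundary, the clipping can stop a small margin short of 0 and 1. This assumes the mean/concentration parametrization alpha = mu * kappa and beta = (1 - mu) * kappa, which may not match your model exactly. The function name `beta_params` and the margin `EPS` are hypothetical, and plain NumPy is used here only to show the arithmetic; inside a model the same idea would apply to the tensor expressions.

```python
import numpy as np

EPS = 1e-6  # hypothetical safety margin; the right size depends on the model


def beta_params(mu, kappa, eps=EPS):
    """Map a mean and concentration to (alpha, beta), keeping both strictly positive.

    Assumes alpha = mu * kappa and beta = (1 - mu) * kappa.
    """
    mu = np.clip(mu, eps, 1.0 - eps)   # keep mu strictly inside (0, 1)
    kappa = np.maximum(kappa, eps)     # keep the concentration strictly positive
    return mu * kappa, (1.0 - mu) * kappa


# mu values sitting exactly on the boundary, as clipping to [0, 1] would allow
alpha, beta = beta_params(np.array([0.0, 0.5, 1.0]), 10.0)
print(alpha, beta)
```

Clipping to the closed interval [0, 1] would let mu reach exactly 0 or 1, which under this parametrization makes alpha or beta exactly zero. With the margin, both stay above zero.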
