Shape error when sampling more than one value

Hi! I am very new to PyMC and probabilistic programming in general, so I wonder if the problem I am having is with me or with PyMC. I have the following model implemented with PyMC 3.5 (distilled from a larger example):

import pymc3 as pm
import numpy as np

with pm.Model() as tmpm:
    pri = pm.Dirichlet('pri', a=np.array([10.5, 5.5, 0.5]), shape=(3,))
    tmp = pm.Multinomial('tmp', n=100, p=pri, shape=(3,))
    b = pm.Beta('b', tmp[0], tmp[1])

I can now do b.random(size=1), and that seems to be working fine, sampling a single value of reasonable magnitude. However, if I try b.random(size=10), I get:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-66-a308db443d2f> in <module>()
      4     b = pm.Beta('b', tmp[0], tmp[1])
      5 
----> 6 b.random(size=10)

/data/01/dm/conda_envs/spark_py35/lib/python3.5/site-packages/pymc3/model.py in __call__(self, *args, **kwargs)
     40 
     41     def __call__(self, *args, **kwargs):
---> 42         return getattr(self.obj, self.method_name)(*args, **kwargs)
     43 
     44 

/data/01/dm/conda_envs/spark_py35/lib/python3.5/site-packages/pymc3/distributions/continuous.py in random(self, point, size)
   1073         return generate_samples(stats.beta.rvs, alpha, beta,
   1074                                 dist_shape=self.shape,
-> 1075                                 size=size)
   1076 
   1077     def logp(self, value):

/data/01/dm/conda_envs/spark_py35/lib/python3.5/site-packages/pymc3/distributions/distribution.py in generate_samples(generator, *args, **kwargs)
    522             dist_shape: {dist_shape}
    523             broadcast_shape: {broadcast_shape}
--> 524         '''.format(size=size, dist_shape=dist_shape, broadcast_shape=broadcast_shape))
    525 
    526     # reshape samples here

TypeError: Attempted to generate values with incompatible shapes:
            size: 10
            dist_shape: ()
            broadcast_shape: (3,)

Am I doing something wrong, or is it an issue with PyMC?

Could you try with b = pm.Beta('b', tmp[..., 0], tmp[..., 1]) instead?

The reason I ask is that tmp[0] is a symbolic expression that says “get the first item along the first dimension”. Inside b.random, the first thing that happens is that values are drawn for the distribution’s parameters. In your case, the parameters are symbolic expressions that depend on tmp, so tmp also gets called in draw_values. That call should return a (10, 3) array, so tmp[0] and tmp[1] would end up being (3,) arrays. I’m confused as to why it works when you pass size=1, because there tmp should end up as a (1, 3) array. I’m not at my computer right now, so I can’t look deeper into the issue, but if the ellipsis doesn’t work, I’ll check more tomorrow.
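To illustrate the indexing point with plain NumPy (a sketch of what I assume draw_values resolves tmp to when size=10, namely a (10, 3) array with one multinomial draw per row):

```python
import numpy as np

# Assumed stand-in for what draw_values produces for `tmp` with size=10:
# a (10, 3) array, one multinomial draw per row.
tmp = np.random.multinomial(100, [0.6, 0.3, 0.1], size=10)

print(tmp[0].shape)       # (3,)  -- the whole first draw, not the first category
print(tmp[..., 0].shape)  # (10,) -- the first category across all 10 draws
```

With the ellipsis, the index always picks from the last axis, regardless of how many sample dimensions get prepended.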

Or something like b = pm.Beta('b', tmp[0], tmp[1], shape=10)


Thank you for the responses! Adding the ellipsis did not help, unfortunately – still the same error. As for specifying the shape in Beta, b = pm.Beta('b', tmp[0], tmp[1], shape=10) still errors out on b.random(size=10), this time with:

TypeError: Attempted to generate values with incompatible shapes:
            size: 10
            dist_shape: (10,)
            broadcast_shape: (3,)

Setting shape=3 allows it to run without error, but the result is a (10, 3) array, and I am not sure how to interpret the sampled values. I would like to produce a vector of 10 values sampled from the beta distribution, whose parameters are counts of specific buckets sampled from the multinomial distribution.

I also noticed that tmp.random(size=10) returns a (3, 10) array, as I expected, so it seems tmp[0] should refer to the vector of 10 samples corresponding to the first bucket, and passing it as the first parameter of Beta is what I intended.

I think what @junpenglao meant is that if you specify the shape parameter on b of 10, then b.random() (without a size specified) will yield 10 samples of b, e.g.

with pm.Model() as tmpm:
    pri = pm.Dirichlet('pri', a=np.array([10.5, 5.5, 0.5]), shape=(3,))
    tmp = pm.Multinomial('tmp', n=100, p=pri, shape=(3,))
    
    b = pm.Beta('b', tmp[0], tmp[1], shape=10)
b.random()
array([0.70177365, 0.61298271, 0.71219658, 0.6304795 , 0.64659485,
       0.72043102, 0.62172508, 0.68043971, 0.66485797, 0.65243237])

tmp.random(size=10) should return a (10, 3) array. If you are getting it transposed, there is something deeper going wrong. I’ll look into it tonight.

Ok. So I checked with versions of master from before and after merging this PR (which restructured Multinomial._random). The older implementation of Multinomial._random had the problem that tmp.random(size=10).shape == (3, 10) instead of (10, 3), leading to all the awkwardness that you ran into. PR #3285 luckily fixes that and makes your example code work: tmp.random(size=10).shape == (10, 3) and b.random(size=10).shape == (10,).
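The (10, 3) orientation matches NumPy’s own convention, where the size argument prepends sample dimensions to the event shape; a quick check with plain NumPy:

```python
import numpy as np

# NumPy prepends the `size` dimensions to the event shape:
draws = np.random.multinomial(100, [0.6, 0.3, 0.1], size=10)
print(draws.shape)  # (10, 3)

# so the counts of the first category across draws are draws[:, 0], not draws[0]
print(draws[:, 0].shape)  # (10,)
```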

To fix your problem, you just need to update to the latest release, 3.6.rc1, or check out the bleeding-edge master branch from GitHub.


Thanks to everyone who responded. Both solutions (specifying the shape in Beta and upgrading to 3.6.rc1) solve this problem!


I have a further question regarding the two solutions. Are they supposed to produce equivalent results? It seems to me that specifying the shape parameter results in much higher variance of the samples between calls to b.random(). Here is a small experiment:

with pm.Model() as tmpm1:
    pri = pm.Dirichlet('pri', a=np.array([10.5, 5.5, 0.5]), shape=(3,))
    tmp = pm.Multinomial('tmp', n=100, p=pri, shape=(3,))
    b1 = pm.Beta('b1', tmp[0], tmp[1], shape=10000)
    b2 = pm.Beta('b2', tmp[0], tmp[1])

I now sample from b1 and b2 ten times each and calculate the mean and standard deviation of each sample:

b1_stats = [[np.mean(ss), np.std(ss)] 
            for _ in range(10)
            for ss in [b1.random(size=1)]]
b2_stats = [[np.mean(ss), np.std(ss)] 
            for _ in range(10) 
            for ss in [b2.random(size=10000)] ]

and then see how dispersed these sample statistics are:

np.mean(b1_stats, axis=0), np.std(b1_stats, axis=0)
(array([ 0.64100557,  0.04667407]), array([ 0.10264955,  0.00359506]))
np.mean(b2_stats, axis=0), np.std(b2_stats, axis=0)
(array([ 0.65651382,  0.13297402]), array([ 0.00102267,  0.00090269]))

– the standard error of the means is two orders of magnitude higher when specifying the shape than when specifying the size of the random sample. I suppose this might be an artifact of running only 10 iterations, or of the sample size in each iteration not being high enough, but when I tried to increase either, I ran into the following problem:

b2.random(size=100000)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-58-6bb6d8da33b5> in <module>()
----> 1 b2.random(size=100000)

~/anaconda3/envs/pymc3/lib/python3.5/site-packages/pymc3/model.py in __call__(self, *args, **kwargs)
     41 
     42     def __call__(self, *args, **kwargs):
---> 43         return getattr(self.obj, self.method_name)(*args, **kwargs)
     44 
     45 

~/anaconda3/envs/pymc3/lib/python3.5/site-packages/pymc3/distributions/continuous.py in random(self, point, size)
   1161         return generate_samples(stats.beta.rvs, alpha, beta,
   1162                                 dist_shape=self.shape,
-> 1163                                 size=size)
   1164 
   1165     def logp(self, value):

~/anaconda3/envs/pymc3/lib/python3.5/site-packages/pymc3/distributions/distribution.py in generate_samples(generator, *args, **kwargs)
    594     elif broadcast_shape[-len(dist_shape):] == dist_shape or len(dist_shape) == 0:
    595         if size == 1 or (broadcast_shape == size_tup + dist_shape):
--> 596             samples = generator(size=broadcast_shape, *args, **kwargs)
    597         elif dist_shape == broadcast_shape:
    598             samples = generator(size=size_tup + dist_shape, *args, **kwargs)

~/anaconda3/envs/pymc3/lib/python3.5/site-packages/scipy/stats/_distn_infrastructure.py in rvs(self, *args, **kwds)
    938         cond = logical_and(self._argcheck(*args), (scale >= 0))
    939         if not np.all(cond):
--> 940             raise ValueError("Domain error in arguments.")
    941 
    942         if np.all(scale == 0):

ValueError: Domain error in arguments.

You are getting the domain error because the multinomial sometimes draws a zero in the first or second category, and neither of the beta distribution’s parameters may be less than or equal to zero.
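This is easy to reproduce directly (a sketch with made-up probabilities, just to show the mechanism): multinomial draws can contain zero counts, and scipy rejects a zero Beta parameter with exactly the error above.

```python
import numpy as np
from scipy import stats

# With a small enough category probability, multinomial draws can contain zeros:
counts = np.random.multinomial(100, [0.989, 0.01, 0.001], size=1000)
print((counts == 0).any())  # almost certainly True for the last category

# A zero count passed as a Beta parameter triggers the same scipy error:
try:
    stats.beta.rvs(0, 5)
except ValueError as err:
    print(err)  # Domain error in arguments.
```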


I’m confused about the difference you point out. I’ll look a bit deeper into its origin.

Upon further thought, the main difference between the two is that when you specify the shape, it is used as if it were a batch shape. What I mean is that all 10000 values are drawn using exactly the same single pair of parameter values.

In the second case, by contrast, an independent pair of alpha and beta parameter values is drawn for each value of b2.

I think that what you need is more in line with the second method than with the first.


Thanks again, this makes sense! I suppose this also explains why the mean of the standard deviations was higher for the variable-parameters approach, while the standard error of the means was higher for the fixed-parameters approach.