Hello,
I am trying to use the Data
class to hold some observed values of a Bernoulli distribution. However, when I do this I get an error at the Theano level that I can’t make any sense of. I’m using pymc3 3.10.0.
I’ve written 2 examples below that should reproduce the error. The first is a basic example and the second is more representative of the underlying use case. I’ve included the additional example because in the first I could just not use the Bernoulli distribution and use the x_data
variable directly. But for the second example, I don’t believe this would work as a possible fix.
I’m pretty sure the error is down the Data
class because if I just use just the data directly the error is not reproduced.
Basic Example:
import scipy.stats as sts
import pymc3 as pm
x = np.random.randint(0, 2, size=200)
y = 2.4 * x
with pm.Model() as model:
# x_data = x
x_data = pm.Data('x_data', x)
y_data = pm.Data('y_data', y)
X = pm.Bernoulli('X', p=0.5, observed=x_data)
a1 = pm.Normal('a1', 0, 5)
Y = pm.Normal('Y', mu=a1 * X, sd=4., observed=y_data)
with model:
trace = pm.sample(2000)
More detailed example:
import scipy.stats as sts
import pymc3 as pm
x1 = sts.norm.rvs(loc=0, scale=1., size=200)
x2 = sts.bernoulli.rvs(p=sts.logistic.cdf(1.8 * x1),
size=200)
y = sts.norm.rvs(loc=2.4 * x2, scale=2.)
with pm.Model() as model:
x1_data = pm.Data('x1_data', x1)
x2_data = pm.Data('x2_data', x2)
# x2_data = x
y_data = pm.Data('y_data', y)
a1 = pm.Normal('a1', 0., 4.)
a2 = pm.Normal('a2', 0., 4.)
X2 = pm.Bernoulli('X',
p=pm.math.sigmoid(a1*x1_data),
observed=x2_data)
Y = pm.Normal('Y', mu=a2 * X2, sd=4., observed=y_data)
with model:
trace = pm.sample(2000)
Portion of Error traceback:
TypeError: expected type_num 5 (NPY_INT32) got 7
Apply node that caused the error: Elemwise{Composite{(i0 + (i1 * sqr((i2 - (i3 * i4)))))}}(TensorConstant{(1,) of -4..5788649127}, TensorConstant{(1,) of -0.0625}, y_data, InplaceDimShuffle{x}.0, X)
Toposort index: 3
Inputs types: [TensorType(float64, (True,)), TensorType(float64, (True,)), TensorType(float64, vector), TensorType(float64, (True,)), TensorType(int32, vector)]
Inputs shapes: [(1,), (1,), (200,), (1,), (200,)]
Inputs strides: [(8,), (8,), (8,), (8,), (8,)]
Inputs values: [array([-4.61046579]), array([-0.0625]), 'not shown', array([0.]), 'not shown']
Outputs clients: [[Sum{acc_dtype=float64}(Elemwise{Composite{(i0 + (i1 * sqr((i2 - (i3 * i4)))))}}.0)]]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
Any help on this would be awesome! Thanks.