I’m trying out some simple imputation of missing observed values with a Bernoulli distribution and have hit a Theano problem. I was wondering if anyone has any ideas about solving it, or whether it’s a Theano bug. I’m using PyMC3 version 3.6 and Theano version 1.0.3. A simple version of my code is as follows:
    import numpy as np
    import pymc3 as pm
    from scipy.stats import bernoulli

    # set "true" probability of rain
    true_rain = 0.41

    # set number of previous "observations"
    nobs = 1000

    # set the observations
    has_rained = bernoulli.rvs(true_rain, size=nobs)

    # try substituting in a missing sample
    has_rained[0] = -1  # add a missing sample as -1 (index chosen arbitrarily)
    has_rained = np.ma.masked_values(has_rained, value=-1)  # create masked array

    with pm.Model() as model:
        prain = pm.Uniform('prain', 0.0, 1.0)  # prior on probability of rain

        # distribution of prain given the number of observed times it has rained
        rain = pm.Bernoulli('rain', p=prain, observed=has_rained)

        trace = pm.sample(2000, tune=6000, discard_tuned_samples=True, chains=2)
The final lines of the error message that this produces are:
    ~/.conda/envs/survival/lib/python3.6/site-packages/theano/tensor/type.py in filter_variable(self, other, allow_convert)
        232                     dict(othertype=other.type,
        233                          other=other,
    --> 234                          self=self))
        235
        236     def value_validity_msg(self, a):

    TypeError: Cannot convert Type TensorType(int64, vector) (of Variable rain_missing_shared__) into Type TensorType(int64, (True,)). You can try to manually convert rain_missing_shared__ into a TensorType(int64, (True,)).
I can only assume that this is failing due to an issue with the Bernoulli distribution’s use of integer or boolean types, as this isn’t a problem that is noted in this example.
I also see the same error when passing a Theano shared variable, created from a NumPy array of ones and zeros, as observations to a Bernoulli distribution.
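One detail that may be relevant: the target type in the error, TensorType(int64, (True,)), is a vector whose only dimension is marked broadcastable, which Theano uses for dimensions known to have length 1. The sketch below uses NumPy only and does not call PyMC3 or Theano. It counts the masked entries in a small set of observations and derives the broadcastable pattern that a length-based rule would give. The helper name `broadcastable_pattern` is hypothetical, and the idea that the `rain_missing` variable is sized by the number of masked entries is an assumption, not something the traceback confirms.

```python
import numpy as np


def broadcastable_pattern(shape):
    """Hypothetical helper: flag each dimension of length 1 as broadcastable."""
    return tuple(int(dim) == 1 for dim in shape)


# Observations with a single missing sample encoded as -1
obs_one = np.array([-1, 0, 1, 1, 0], dtype="int64")
masked_one = np.ma.masked_values(obs_one, value=-1)
n_missing_one = int(masked_one.mask.sum())
pattern_one = broadcastable_pattern((n_missing_one,))
print(n_missing_one, pattern_one)

# Same check with two missing samples
obs_two = np.array([-1, 0, -1, 1, 0], dtype="int64")
masked_two = np.ma.masked_values(obs_two, value=-1)
n_missing_two = int(masked_two.mask.sum())
pattern_two = broadcastable_pattern((n_missing_two,))
print(n_missing_two, pattern_two)
```

If the single-missing case is the only one that yields a length-1 pattern, it might be worth checking whether the error still appears when two or more samples are masked, which would help separate a shape issue from a dtype issue.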