Dealing with 1 missing observation

Jack_Caster · July 27, 2018, 10:10am

I am experimenting with a logistic regression that has missing values in the predictor. As suggested here (and elsewhere), I am masking the missing values with a numpy masked array. This lets me infer what the values of the missing data were.
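For readers unfamiliar with the masking step, here is a minimal sketch of building such a masked array with numpy alone. The data and the -1 sentinel for "missing" are made up for illustration and are not taken from the notebook.

```python
import numpy as np

# Hypothetical binary predictor; -1 is an assumed sentinel for "missing".
raw = np.array([1, 0, -1, 1, 0], dtype=np.int64)

# Mask every entry equal to the sentinel.
predictor = np.ma.masked_equal(raw, -1)

# Number of masked (missing) entries and the remaining observed values.
n_missing = int(np.ma.count_masked(predictor))
observed = predictor.compressed()

print(predictor.mask, n_missing, observed)
```

With this toy data exactly one entry is masked, which is the situation that triggers the error described next.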

Everything works fine, but if there is only 1 missing observation Theano complains:

TypeError: Cannot convert Type TensorType(int64, vector) (of Variable obs_t_minus_1_missing_missing_shared__) into Type TensorType(int64, (True,)). You can try to manually convert obs_t_minus_1_missing_missing_shared__ into a TensorType(int64, (True,)).

If I have, for example, 2 missing observations then everything works fine.
Do you know how I can solve this? I am not really at ease with Theano, unfortunately.
I have set up a notebook here, if you are curious.
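A note on the error message above: the two types it names have the same dtype (int64) and differ only in their broadcastable pattern, where "vector" means (False,). The sketch below is plain Python, not Theano code, and rests on an assumption about how the patterns are assigned: a constant gets a broadcastable axis wherever its length is 1, while a shared variable defaults to no broadcastable axes. The helper names are hypothetical, and whether this is exactly what happens inside PyMC3 is not confirmed in this thread.

```python
def constant_pattern(shape):
    """Assumed rule for a constant: length-1 axes are broadcastable."""
    return tuple(n == 1 for n in shape)


def shared_pattern(shape):
    """Assumed rule for a shared variable: no axis is broadcastable."""
    return tuple(False for _ in shape)


# Compare the two patterns for 1 and 2 missing observations.
for n_missing in (1, 2):
    shape = (n_missing,)
    same = constant_pattern(shape) == shared_pattern(shape)
    print(n_missing, constant_pattern(shape), shared_pattern(shape), same)
```

Under that assumption the patterns only disagree when there is exactly one missing value, which would match the behaviour reported here.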

falk · July 27, 2018, 12:44pm

The part of the error that reads TensorType(int64, vector) into Type TensorType(int64, (True,)) might hint at a data-type issue. (I sometimes find it tricky to follow the internal numpy array dtypes.)

print(obs_t_minus_1.dtype)
print(obs_t_minus_1_ma.dtype)

Both return int64. Shouldn’t they be boolean?
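On the dtype question, a small numpy-only sketch may help. The array here is a made-up stand-in for obs_t_minus_1_ma, not the data from the notebook. A masked array reports the dtype of its underlying data; only its mask is boolean.

```python
import numpy as np

# Hypothetical stand-in for the masked predictor; -1 marks a missing entry.
obs = np.ma.masked_equal(np.array([1, 0, -1, 1], dtype=np.int64), -1)

data_dtype = obs.dtype       # dtype of the underlying data
mask_dtype = obs.mask.dtype  # dtype of the mask itself

print(data_dtype, mask_dtype)
```

So an integer dtype on the masked array is expected and does not by itself indicate a problem.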

Jack_Caster · July 27, 2018, 5:34pm

Mmm… When I have 2 missing data points (and everything works fine) the dtype is also integer, so I do not think that’s the problem. I suspect the issue is related to the shape of the tensor being (1,) instead of (1, 1). I have run into this kind of odd error with numpy before, which is why I tend to use vectors with 2 dimensions (via the function atleast_2d). I think the fix needs to be made in the PyMC3 code rather than in my code.
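To make the shape difference concrete, here is a short numpy sketch with a made-up one-element array. It only shows what atleast_2d does to the shape; it is not a confirmed workaround for the Theano error.

```python
import numpy as np

# A single missing observation stored as a 1-D array (value is arbitrary).
one_missing = np.array([3], dtype=np.int64)

# atleast_2d prepends an axis, turning shape (1,) into (1, 1).
as_2d = np.atleast_2d(one_missing)

shape_1d = one_missing.shape
shape_2d = as_2d.shape

print(shape_1d, shape_2d)
```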

junpenglao · July 27, 2018, 9:04pm

It’s quite likely a PyMC3 / Theano bug. Could you please file an issue on GitHub?

mattpitkin · January 15, 2019, 10:44pm

Just to note that this is fixed in PyMC3 with this PR. It is not in a release yet, though.
