Masked arrays won’t help here. PyMC3 supports missing values in observed variables, which in this case would be y. To handle missing values in your predictor X, you have to specify a distribution for those values. What data types do you have in there (continuous or discrete)? Can you post the matrix, or a subset of it?