TypeError in comparison when using pm.Data

Hi all,

I have been reading through the documentation, trying to figure out how to correctly use pm.Data so I can pass in data for training. I keep having the following issue.

Code:

import numpy as np
import pymc3 as pm

with pm.Model() as model:
    Site_prior = pm.Dirichlet("Site_prior", a=np.array([1,1,1,1,1]))
    Site_obs = pm.Data("Site_obs", X_train["Site_Code"]) # X_train["Site_Code"] takes on values 0,1,2,3,4
    Site = pm.Categorical("Site", p=Site_prior, observed=Site_obs)
    v = pm.math.switch(pm.math.eq(Site, 0), 0, 10)
    y = pm.Normal("y", mu=v, sigma=1, observed=y_train)
    pm.sample(10)

Error:

The error when converting the test value to that variable type:
TensorType(int32, vector) cannot store a value of dtype int64 without risking loss of precision. If you do not mind this loss, you can: 1) explicitly cast your data to int32, or 2) set "allow_input_downcast=True" when calling "function".

I do not understand what is causing this, because the code works fine if I pass the data to observed directly instead of using pm.Data. I’m sure I’m making some very basic mistake here, as I am new to PyMC3. If anyone could help, I would be very appreciative!

There’s likely a processing step involved with observed that automatically casts the data or manages its type before the kind of error you’ve seen can come up, and pm.Data may not go through that step. Can you work around this by casting explicitly, i.e. Site_obs = pm.Data("Site_obs", X_train["Site_Code"].astype("int32"))?
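The dtype mismatch described in the error message can be reproduced with NumPy alone, without building the model. The sketch below is only an illustration: site_code is a hypothetical stand-in for X_train["Site_Code"], and np.can_cast is used as a rough analogue of the "no silent loss of precision" check, not as the check Theano itself runs.

```python
import numpy as np

# Hypothetical stand-in for X_train["Site_Code"]: integer codes 0..4 stored as int64.
site_code = np.array([0, 1, 2, 3, 4, 0, 2], dtype="int64")

# Under NumPy's default "safe" casting rule, int64 -> int32 is refused,
# because an int64 value might not fit in 32 bits.
safe_before = np.can_cast(site_code.dtype, np.int32)

# Explicit downcast, as suggested in the reply; the codes 0..4 fit in int32.
site_code_32 = site_code.astype("int32")
safe_after = np.can_cast(site_code_32.dtype, np.int32)

# Confirm the cast did not change any value.
values_unchanged = bool(np.array_equal(site_code, site_code_32))

print(safe_before, safe_after, values_unchanged)
```

If the values survive the cast unchanged, as they do for small category codes, passing the int32 array to pm.Data should avoid the precision complaint, assuming the dtype is the only problem.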