How to make out of sample predictions with categorical variables? Implementation question from Rethinking 2 Book

mldl920 · July 22, 2020, 8:25pm

Hi, how’s it going? I’m going through Rethinking 2 and was wondering if you could help me finish this code so that it makes out-of-sample predictions.

Here is a simple example of what I have so far:

import numpy as np
import pandas as pd

data = pd.DataFrame({
    "var1": np.random.random_sample(12),
    "cat_var": ["a", "b"] * 6,
    "y": np.random.randint(5, 20, size=12),
})

# pd.get_dummies would drop the "cat_var" column, so build the index directly from it
cid = pd.Categorical(data["cat_var"])

with pm.Model() as m_5_1:
    a = pm.Normal("a", 1, 0.1, shape=cid.categories.size)
    b = pm.Normal("b", 0, 0.3)
    sigma = pm.Uniform("sigma", 0, 1)
    # cid.codes holds the integer index: 0 for "a", 1 for "b"
    mu = pm.Deterministic("mu", a[cid.codes] + b * data["var1"].values)
    y = pm.Normal("y", mu=mu, sigma=sigma, observed=data["y"].values)
    trace = pm.sample()

Now, if I were to make out-of-sample predictions without the categorical variable, I would do:

import xarray as xr

# newdata is the data frame holding the new, unseen rows
new_data_0 = xr.DataArray(
    newdata["var1"],
    dims=["pred_id"]
)

pred_mean = (
    trace["a"][:newdata.shape[0]] +
    trace["b"][:newdata.shape[0]] * new_data_0 
)

rng = np.random.default_rng()
predictions = xr.apply_ufunc(
    lambda mu, sd: rng.normal(mu, sd),
    pred_mean,
    trace["sigma"][:newdata.shape[0]],
)

How would I add the categorical variable from new, unseen data to this xarray-based format for predictions? In this case the index would be 0 for the letter “a” and 1 for the letter “b”.

Thank you!
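
One way to get the 0/1 index for unseen rows, sketched here with pandas only: reuse the category list from the training data, so that "a" always maps to 0 and "b" to 1, whichever letters happen to appear in the new data. The name `newdata` follows the question above, but its contents are invented for illustration, as are `train_cat`, `new_cid` and `new_codes`.

```python
import pandas as pd

# Training data: the category order is fixed here ("a" -> 0, "b" -> 1).
train_cat = pd.Categorical(["a", "b"] * 6)

# Hypothetical unseen data (values invented for illustration).
newdata = pd.DataFrame({"var1": [0.1, 0.5, 0.9], "cat_var": ["b", "b", "a"]})

# Reuse the training categories so the codes match the indices used in the model.
new_cid = pd.Categorical(newdata["cat_var"], categories=train_cat.categories)
new_codes = new_cid.codes  # integer array; -1 marks a level not seen in training

print(new_codes.tolist())  # [1, 1, 0]
```

Passing `categories=` explicitly matters when the new data contains only some of the levels: without it, pandas would renumber the levels from whatever is present, and the codes would no longer line up with the entries of `a`.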

AlexAndorra · July 23, 2020, 9:10am

Hi,
I think you can use a combination of pm.Data and pm.sample_posterior_predictive to do that, which is more idiomatic in PyMC.
The repo of the port of Rethinking_2 to PyMC should have all the help you need.
PyMCheers
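
For reference, the manual route from the question can also be extended to the categorical index. Below is a minimal sketch that uses plain NumPy arrays instead of xarray. The posterior draws are simulated stand-ins, not output from a fitted model, and every name (`post_a`, `post_b`, `post_sigma`, `new_var1`, `new_codes`) is illustrative rather than taken from the Rethinking port. With a real fit, the draws would come from the trace, with `a` holding one column per category.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in posterior draws (hypothetical values, NOT from a fitted model).
n_draws = 1000
post_a = rng.normal([8.0, 12.0], 0.5, size=(n_draws, 2))  # one intercept per category
post_b = rng.normal(0.0, 0.3, size=n_draws)
post_sigma = rng.uniform(0.5, 1.0, size=n_draws)

# Hypothetical new data: predictor values and category codes (0 = "a", 1 = "b").
new_var1 = np.array([0.1, 0.5, 0.9])
new_codes = np.array([1, 1, 0])

# Pick each new row's intercept for every draw: result has shape (n_draws, n_new).
pred_mean = post_a[:, new_codes] + post_b[:, None] * new_var1[None, :]

# Simulate one outcome per posterior draw and per new row.
predictions = rng.normal(pred_mean, post_sigma[:, None])

print(predictions.shape)  # (1000, 3)
```

Indexing with `post_a[:, new_codes]` keeps every posterior draw and selects the matching intercept for each new row, so no slicing of the trace by the number of new rows is needed.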
