I believe this will solve the problem. Thanks!
I have a few more syntax-related questions:
I have used X_train_aug in the y = ~ line because I need to use very specific columns for model building (see below):
y = alpha[group_idx] + X_train_aug[col1].values*beta1[group_idx] + X_train_aug[col2].values*beta2[group_idx] + error
In the above line of code, I have used X_train_aug[col1].values and X_train_aug[col2].values to specify my formula (in reality there are 40+ columns that I write out manually). How can I do this with my mutable data object X? I can't seem to slice the mutable data X by columns.
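To make the slicing question concrete, here is a minimal NumPy sketch of what I am trying to do. It assumes the mutable data object behaves like a plain 2-D array without column labels, so names have to be translated into integer positions before indexing. The names all_columns, X_values, wanted and col_pos are made up for illustration and are not part of my real code.

```python
import numpy as np

# Hypothetical stand-in for the training frame: column names plus a 2-D array.
all_columns = ["col1", "col2", "col3", "col4"]
X_values = np.arange(12, dtype=float).reshape(3, 4)  # 3 rows, 4 columns

# Columns the formula actually needs (hypothetical selection).
wanted = ["col1", "col3"]

# Translate column names into integer positions once, outside the model.
col_pos = [all_columns.index(name) for name in wanted]

# Integer indexing works on a bare array, where name-based lookup does not.
X_selected = X_values[:, col_pos]
```

If the same integer indexing carries over to the mutable data object, I could keep a single X and pick columns by position instead of by name, but I have not confirmed that.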
The example shown in this code uses two parameters (beta1 & beta2) to specify the formula:
beta1 = pm.Normal('beta1', mu_beta, sigma_beta, dims = 'group')
beta2 = pm.Normal('beta2', mu_beta, sigma_beta, dims = 'group')
alpha = pm.Normal('alpha' , mu_alpha , sigma_alpha, dims = 'group')
y = alpha[group_idx] + X_train_aug[col1].values*beta1[group_idx] + X_train_aug[col2].values*beta2[group_idx] + error
In reality, I have 40+ columns, so I need to initialize 40+ betas. I know I can use the shape parameter to initialize all of my priors in a single line of code; however, I also need to use the dims parameter to tell my model that there is a hierarchy in the data. How can I write my model so that all the betas are initialized with an appropriate prior (Normal) and the correct hierarchy, dims = 'group'?
Right now I am doing something like this:
beta1 = pm.Normal('beta1', mu_beta, sigma_beta, dims = 'group')
beta2 = pm.Normal('beta2', mu_beta, sigma_beta, dims = 'group')
beta3 = pm.Normal('beta3', mu_beta, sigma_beta, dims = 'group')
beta4 = pm.Normal('beta4', mu_beta, sigma_beta, dims = 'group')
beta5 = pm.Normal('beta5', mu_beta, sigma_beta, dims = 'group')
beta6 = pm.Normal('beta6', mu_beta, sigma_beta, dims = 'group')
beta7 = pm.Normal('beta7', mu_beta, sigma_beta, dims = 'group')
beta8 = pm.Normal('beta8', mu_beta, sigma_beta, dims = 'group')
beta9 = pm.Normal('beta9', mu_beta, sigma_beta, dims = 'group')
beta10 = pm.Normal('beta10', mu_beta, sigma_beta, dims = 'group')
beta11 = pm.Normal('beta11', mu_beta, sigma_beta, dims = 'group')
beta12 = pm.Normal('beta12', mu_beta, sigma_beta, dims = 'group')
beta13 = pm.Normal('beta13', mu_beta, sigma_beta, dims = 'group')
beta14 = pm.Normal('beta14', mu_beta, sigma_beta, dims = 'group')
beta15 = pm.Normal('beta15', mu_beta, sigma_beta, dims = 'group')
beta16 = pm.Normal('beta16', mu_beta, sigma_beta, dims = 'group')
beta17 = pm.Normal('beta17', mu_beta, sigma_beta, dims = 'group')
beta18 = pm.Normal('beta18', mu_beta, sigma_beta, dims = 'group')
beta19 = pm.Normal('beta19', mu_beta, sigma_beta, dims = 'group')
...
y = alpha[group_idx] + X_train_aug['col1'].values*beta1[group_idx] + X_train_aug['col2'].values*beta2[group_idx] + X_train_aug['col3'].values*beta3[group_idx] + ... + X_train_aug['col45'].values*beta45[group_idx] + error
How can I achieve this efficiently?
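To show the computation I am after, here is a plain NumPy sketch (not PyMC) of the long formula written with a single 2-D beta array instead of one variable per column. The sizes and random values are toy assumptions, and the error term is left out so the two forms can be compared directly.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_groups, n_pred = 6, 3, 4            # toy sizes, not the real 40+ columns

X = rng.normal(size=(n_obs, n_pred))         # predictors, one column per beta
group_idx = np.array([0, 1, 2, 0, 1, 2])     # group membership of each row
alpha = rng.normal(size=n_groups)            # one intercept per group
beta = rng.normal(size=(n_groups, n_pred))   # one slope per group and predictor

# Manual form: one term per column, as in the long formula.
y_manual = alpha[group_idx].copy()
for j in range(n_pred):
    y_manual += X[:, j] * beta[group_idx, j]

# Vectorized form: row-wise product, then sum over the predictor axis.
y_vectorized = alpha[group_idx] + (X * beta[group_idx]).sum(axis=1)
```

Both forms give the same values here. My guess is that the PyMC counterpart would be a single pm.Normal with two dims, something like dims = ('group', 'predictor'), where 'predictor' is a coordinate name I made up, but I am not sure this is the recommended way.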