How to extract the variational posterior expected matrix of regression weights from a categorical model?

Hi,

I have a model of type bambi.models.Model with family "categorical" and link "softmax". For this model, the learned regression weights can be expressed as an M x K matrix, where M is the number of covariates and K is the number of categories observed in the training data. (With an identifiability constraint imposed, one column of this matrix is fixed to 0, so we could represent it as an M x (K-1) matrix; however, I will ignore this detail for the sake of this discussion.)

I then used PyMC3's mean-field ADVI to fit this model, which gives me an object of type pymc3.variational.approximations.MeanField. If I call that object model_fitted_by_ADVI, I can extract the posterior expected regression weights via model_fitted_by_ADVI.mean.eval(), but they come back as a flat vector, so it is unclear how to reconstruct the weight matrix from it: I do not know whether the entries of the vector represent the matrix unfolded by row, by column, or by some other mechanism.
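To make the ambiguity concrete, here is a small numpy sketch (with a made-up 2x3 matrix, nothing to do with the actual model) showing that the same matrix flattens to two different vectors depending on the unfolding order:

```python
import numpy as np

# Hypothetical 2x3 weight matrix, flattened two different ways.
W = np.arange(6).reshape(2, 3)          # [[0, 1, 2], [3, 4, 5]]
flat_row_major = W.flatten(order="C")   # [0, 1, 2, 3, 4, 5]
flat_col_major = W.flatten(order="F")   # [0, 3, 1, 4, 2, 5]

# The inverse reshape only recovers W if it uses the matching order.
assert np.array_equal(np.reshape(flat_row_major, (2, 3), order="C"), W)
assert np.array_equal(np.reshape(flat_col_major, (2, 3), order="F"), W)
```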

What is the recommended way to reconstruct the regression weight matrix from the flat vector (or to access it directly)?

Thank you kindly in advance for any guidance that you might be able to share.

Hi!

Could you share a snippet that allows reproducing the situation you describe? That would be really helpful so I can test it on my side before trying to give an answer.

Thanks!


Sure thing.


import bambi as bmb
import pandas as pd

# Get some data for a categorical regression
url_to_iris_data = "https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv"
df = pd.read_csv(url_to_iris_data)
df = df.astype({"species": "category"})

# Construct the bambi model
formula_as_string = "species ~ sepal_length + sepal_width + petal_length + petal_width"
model = bmb.Model(formula_as_string, df, family="categorical")

# Fit with ADVI
advi = model.fit(method="advi")
model_fitted = advi.fit(10000)

# Extract information about the variational family after optimization
expected_beta_as_flat_vector = model_fitted.mean.eval()


My question: how do I convert beta from a vector to a matrix?
There are M=4 covariates and K=3 categories, so with an intercept term there should be K*(M+1)=15 regression coefficients, which we can represent as an (M+1) x K matrix. Under an identifiability constraint, one column is set to the zero vector, leaving (M+1) x (K-1) = 10 entries. expected_beta_as_flat_vector gives us those ten entries, but it is unclear how to reconstruct the (M+1) x K matrix from them.
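For what it's worth, here is the reconstruction I would do if I knew the unfolding order and the position of the baseline column; the row-major order and baseline-first placement below are assumptions, which is exactly what I'm unsure about:

```python
import numpy as np

# Sketch of the reconstruction, ASSUMING the 10 entries are the
# (M+1) x (K-1) block in row-major order and that the baseline
# category (the all-zeros column) comes first. Both assumptions
# need to be verified against bambi/pymc3.
M, K = 4, 3
flat = np.arange(10.0)  # stand-in for expected_beta_as_flat_vector
beta_nonbaseline = flat.reshape(M + 1, K - 1)
beta_full = np.hstack([np.zeros((M + 1, 1)), beta_nonbaseline])
assert beta_full.shape == (M + 1, K)
```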

Hi ashman,

I came across this problem a while back too. What worked for me was to first run ADVI:

res = pm.fit(method='advi', n=100000)

And then use the following code to get the mean and standard deviation dictionaries:

means_pymc3 = res.params[0].eval()  # flat vector of posterior means
sds_pymc3 = res.std.eval()          # flat vector of posterior standard deviations

pymc3_mean_dict = res.bij.rmap(means_pymc3)  # map back to per-variable arrays
pymc3_sd_dict = res.bij.rmap(sds_pymc3)
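In case it's useful, conceptually bij.rmap just splits the flat vector by each variable's size and reshapes each piece back to its original shape. A toy numpy sketch of that idea (the variable names and shapes here are made up):

```python
import numpy as np

# Toy ordered mapping of variable names to shapes (made up for illustration).
shapes = {"Intercept": (2,), "sepal_length": (2,), "b": (3, 2)}

flat = np.arange(2 + 2 + 6, dtype=float)  # stand-in for res.params[0].eval()

# Split the flat vector by each variable's size, then reshape.
unpacked, offset = {}, 0
for name, shape in shapes.items():
    size = int(np.prod(shape))
    unpacked[name] = flat[offset:offset + size].reshape(shape)
    offset += size

assert unpacked["b"].shape == (3, 2)
```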

Hope this helps, let me know.


Thanks, Martin. That is very helpful!

I can almost reconstruct the beta matrix from this. The only remaining piece is to determine which category has been assigned the zero vector for its regression weights (including the intercept). Is there a way to do this programmatically?

(So I’m now hoping to un-ignore the detail I initially said I’d ignore for the sake of this discussion :grin: )

It’s been a while since I used Bambi, but I think the baseline was the first level of the categorical, which by default is the first alphabetically. I may be wrong though; probably a question for @tcapretto!


There are a couple of places where you can look for the information.

The Model object has an attribute called response, which represents the response of the model.

model.response
# ResponseTerm(name: species, shape: (150,), levels: ['setosa', 'versicolor', 'virginica'], baseline: setosa)

There you can see the levels of the response variable are 'setosa', 'versicolor', and 'virginica', and the level taken as the baseline, i.e. the one with 0 coefficients, is 'setosa'.

You can access this information programmatically

model.response.levels
# ['setosa', 'versicolor', 'virginica']

and then

model.response.baseline
# ['setosa']

You can also see the PyMC coordinates associated with this response term

model.response.pymc_coords
# {'species_coord': ['versicolor', 'virginica']}

and since there’s no 'setosa' you can conclude it is the one being removed.
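Putting it together, one way to find the dropped level programmatically is to diff the two lists (here using the outputs shown above as stand-ins for the model attributes):

```python
levels = ["setosa", "versicolor", "virginica"]  # model.response.levels
kept = ["versicolor", "virginica"]              # model.response.pymc_coords["species_coord"]

# The baseline is whatever appears in the levels but not in the coords.
baseline = [lvl for lvl in levels if lvl not in kept]
assert baseline == ["setosa"]
```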

An additional note

For the moment, Bambi takes the first level as the reference. If I recall correctly, we can't (at least not easily) change which level Bambi picks. However, you can always reorder the categories of the pandas categorical series, and Bambi will take the new first level as the baseline.
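For example, to make 'versicolor' the baseline you could reorder the categories before creating the model (a minimal pandas sketch with a toy series):

```python
import pandas as pd

# Toy series with the same levels as the iris example.
species = pd.Series(["setosa", "versicolor", "virginica"], dtype="category")

# Put 'versicolor' first so Bambi would take it as the baseline.
species = species.cat.reorder_categories(["versicolor", "setosa", "virginica"])
assert list(species.cat.categories)[0] == "versicolor"
```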

I want to keep improving this functionality, so this is subject to change in future releases.

Let me know if you have any other questions :smiley:


I’m not familiar with ADVI, but there are a couple of things I’m worried about.

  1. The results from ADVI and the results from the default NUTS are quite different (if I look at the posterior means/sds). Are these numbers supposed to match?
  2. I’m almost sure the coefficient for the intercept is not right. We use a centered version of the data when creating the underlying PyMC model and we adjust the Intercept once we have the posterior samples. Unfortunately, we only do that for MCMC, not for ADVI.
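For reference, the adjustment itself is simple arithmetic: if the model is fit on centered covariates, the intercept on the original scale is the centered intercept minus the slopes times the column means, since a_c + b @ (x - x_mean) == (a_c - b @ x_mean) + b @ x. A generic numpy sketch with made-up numbers (not Bambi's actual code):

```python
import numpy as np

# Made-up fitted values for a model on centered covariates.
intercept_centered = 1.5        # intercept reported for centered X
slopes = np.array([0.4, -0.2])  # slope estimates (unchanged by centering)
x_means = np.array([5.8, 3.0])  # column means used for centering

# Shift the intercept back to the original covariate scale.
intercept_original = intercept_centered - slopes @ x_means
```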

I’m sorry, but since I’m not familiar with ADVI I haven’t been able to take care of the issues associated with its implementation. There are a couple of things I could fix, but I can’t do it in the immediate future.

Edit

This is where we rescale the intercept.


Just a quick comment: @tcapretto when you say ADVI looks off, is that with 10000 iterations (10^4)? If so, I’d also try much larger values, e.g. at least 10^5. In my experience, 10^4 is rarely enough, and 10^5 can be much better. For a simple model like this I would have thought ADVI might not be so bad. Also, I’ve found it very important to scale the data.
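On the scaling point: what I usually do is standardize each covariate before fitting, e.g. (plain numpy sketch with made-up numbers):

```python
import numpy as np

# Made-up design matrix; standardize each column to mean 0, sd 1.
X = np.array([[5.1, 3.5], [4.9, 3.0], [6.2, 3.4]])
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

assert np.allclose(X_scaled.mean(axis=0), 0.0)
assert np.allclose(X_scaled.std(axis=0), 1.0)
```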