Extracting variable means from ADVI fitted model

david.cortes · December 10, 2017, 6:33pm

I’d like to know how to extract the estimated means from a model that was fit using advi. In the documentation, it’s shown like this
fitted_model.mean.eval()
Which returns a flat 1-D array with all the estimated means, and playing around with the object properties, I find other similar methods that can return the same 1-D array.
However, I’m wondering in what order the variables in there are inserted when fitting some model with many different variables.
My first thought would be that it would put the variables in the order they are defined, but if I try to slice the array like that to extract the means, then they don’t end up matching – I checked it by comparing it to the mean of sampled examples:

samples = fitted_model.sample()
X_mean = np.mean(np.dstack([s[‘X’] for s in samples]), axis=1)

Is there any function to return the estimated means in a dictionary like the sample method does?

junpenglao · December 11, 2017, 6:57am

In a PyMC3 model there is a property to map dict to array and vice versa under
model.bijection. A similar property is in the approximation class, for example, you can do:

    approx = pm.fit()
    # dict
    means_dict = approx.bij.rmap(approx.params[0].eval())
    # array
    sample_array = approx.bij.map(approx.sample(1))

david.cortes · December 15, 2017, 7:07pm

Thanks. Now another question: the dict contains variables with names ending in _log, and I tried a model where I defined variables from non-negative distributions and got negative numbers there, so I’m guessing I should take e^value there, but after I do that, the numbers still don’t match with what I get from a sample mean. Do I need to perform some other transformation?

junpenglao · December 15, 2017, 7:47pm

Doing exp(a_log__) is the correct way, what is this sample mean you comparing with?

david.cortes · December 15, 2017, 8:43pm

I mean comparing with taking the mean of the samples returned by approx.sample() - i.e. it doesn’t match with

np.exp(approx.bij.rmap(approx.params[0].eval()['variable_of_interest']))

However, they seem to match a bit better if I try 2^value instead of e^value.

EDIT: sorry, that 2^value was just a coincidence, didn’t end up matching in further experiments. Nevertheless, np.exp(<bij.rmap means>) and approx.sample() values still don’t match.

junpenglao · December 15, 2017, 11:00pm

They are not the same. approx.sample() is sampling from the approximation with parameters in approx.param. For example, if you use ADVI then calling approx.sample() is sample from a MvNormal with the mean and std stored in approx.param

david.cortes · December 16, 2017, 1:21pm

Sorry, maybe I posed the question incorrectly. Let’s say I want some method to extract the means for each variable from which approx.sample() generates samples, ideally returned in a dictionary, so as to match what I would get from taking the mean of
[sample[‘variable_of_interest’] for sample in approx.sample()]

junpenglao · December 16, 2017, 1:48pm

If you want to extract the means for each variable from the samples returned by approx.sample(), what you are doing is correct. And when you take a lot of samples, this mean should be really close to the mean extracted from approx.params.

david.cortes · December 18, 2017, 9:29am

Been playing with it, and it works in some experiments when the approximated variables are Normal. However, in this example (hierarchical poisson factorization) I keep getting different numbers (results get even more different when using real data):

nentries = 1000
nrows = 100
ncols = 100
latent_dim = 3

np.random.seed(1)
x = np.round(np.random.exponential(20, size=nentries),0).astype('float32')
row_id = np.random.randint(nrows, size=nentries)
col_id = np.random.randint(ncols, size=nentries)

with pm.Model():
    user_activity=pm.Gamma('user_activity', .3, .3, shape = (nrows,1) )
    user_prior=theano.tensor.tile(user_activity, latent_dim, ndim=2)
    theta=pm.Gamma('theta', .3, user_prior, shape=(nrows,latent_dim))
    
    item_pop=pm.Gamma('item_pop', .3, .3, shape = (ncols,1) )
    item_prior=theano.tensor.tile(item_pop, latent_dim, ndim=2)
    beta=pm.Gamma('beta', .3, item_prior, shape=(ncols,latent_dim))
    
    xhat=theano.tensor.sum(theta[row_id] * beta[col_id], axis=1)
    R=pm.Poisson('R',mu=xhat, observed=x)
    HPF=pm.fit()

Then:
np.mean(np.dstack([sample['theta'] for sample in HPF.sample(10000)]), axis=2)-np.exp(HPF.bij.rmap(HPF.params[0].eval())['theta_log__'])

Is that the expected behavior? (all the numbers from approx.sample() are higher than from approx.mean.eval())

junpenglao · December 18, 2017, 10:40am

You are right, I was not clear on my side. The correct way to do is:

trace = HPF.sample(10000)
plt.plot(np.log(trace['theta']).mean(0).flatten())
plt.plot(HPF.bij.rmap(HPF.params[0].eval())['theta_log__'].flatten());

Notice that, the mean of log(theta) is equal to theta_log__ from the approximation, but not the other way around. The reason being that the expecation of a function over x E(f(x)) is not equal to the function of the expectation f(E(x)).

You can check that on each sample, the forward and backward is equally valid:

trace = HPF.sample(1)
np.testing.assert_almost_equal(trace['theta_log__'], np.log(trace['theta']))
np.testing.assert_almost_equal(np.exp(trace['theta_log__']), trace['theta'])

But this is not the case when you take the expectation/mean over all the values.
Hope this is now clearer

david.cortes · December 18, 2017, 10:55am

Oh right! Realized my mistake in the reasoning, thanks for your help!

Dushyant_Sahoo · March 27, 2021, 11:41pm

I have similar problem. I am running the below code:
advi_fit = pm.fit(method=pm.ADVI(), callbacks=[CheckParametersConvergence()])

When I run below commands to get mean and standard deviation:
means_dict = advi_fit.bij.rmap(advi_fit.params[0].eval()) and means_dict = advi_fit.bij.rmap(advi_fit.params[1].eval())

I see some variables renamed by adding _lowerbound__ , for example- alpha_lowerbound__ and some have kept their naming structure.

I am not sure what the lowerbound means here and also most of the standard deviation values are negative. I used truncated normal with lower bound = 0, for those values mean value is less than zero. I think I lack an understanding of the output of the code. Ideally, I am looking to get the mean and standard deviation values of the variables.

Topic		Replies	Views
Labels of means from variational approximations? Questions	2	668	November 28, 2020
Negative values while extracting variable means and standard deviation from ADVI fitted model Questions	0	300	March 28, 2021
How to get named means and sds from ADVI fit? Questions vi	2	416	December 29, 2022
Sampling a gaussian using ADVI Questions	3	700	April 19, 2018
ADVI : Collecting Values for Specific Variables Questions	5	902	May 1, 2018

Extracting variable means from ADVI fitted model

Related topics