Hey community!

I built a probabilistic model and I want to investigate the convergence of the latent parameters every N steps. The procedure can be summarized as follows:

```python
import pymc3 as pm

means = None
for dummy_i in xrange(10):
    with my_model:
        (means, std, elbos) = pm.variational.advi(n=N, start=means)
    check_means(means)
```

I intend to track the model performance over iterations. However, I found that the result depends on the step length N. In my case, with N=5000 the model performance (AUC) fluctuates around 0.6; with N=10000 the performance approaches 1 and the final AUC is around 0.9. It seems that stopping ADVI and continuing from the previous means of the parameters gives a different result.
So my question is: how can I stop the iteration and then continue from where it left off?


The advi API you are using is a bit dated. If you update to master and use the new API, you can do the following:

```python
with pm.Model() as model:
    ...
    # set up the ADVI inference
    inference = pm.ADVI()

approx = inference.fit(n=10000)
```

Now you have an approximation object `approx`, from which you can check the history of the ELBO as well as the mean-field approximation's mu and sd:

```python
import numpy as np

elbos1 = -approx.hist  # shape = (10000,)
# get the fitted values
gbij = approx.bij
cov = approx.cov.eval()
means = gbij.rmap(approx.mean.eval())
sds = gbij.rmap(np.diag(cov)**.5)
```

You can now check the fit by plotting the ELBO or inspecting the `means` and `sds`. If you want to train the model further, just do:

```python
inference.fit(n=20000)
```

and again do

```python
elbos1 = -approx.hist  # shape = (10000,) + (20000,) = (30000,)
# get the fitted values
gbij = approx.bij
cov = approx.cov.eval()
means = gbij.rmap(approx.mean.eval())
sds = gbij.rmap(np.diag(cov)**.5)
```

to get the updated `means` and `sds`.


Thanks for your detailed instructions @junpenglao. Very helpful! However, I still have a problem.
I defined my model with the following code:

```python
import numpy as np
import pymc3 as pm
import theano.tensor as tt

# y is a list of input matrices; a, b and alpha are parameters.
y = input_mats
a = args.a
b = args.b
alpha = 2
n_source = len(y)
N = y[0].shape[0]  # number of rows, shared across sources
dim = args.k
M = [y[c].shape[1] for c in range(n_source)]  # columns per source
prior_mu = np.zeros([N, dim])
thetas = {}
with pm.Model() as model:
    sigma = pm.InverseGamma('sigma', alpha=alpha, shape=n_source)
    x = pm.Laplace('x', mu=prior_mu, b=b, shape=(N, dim))
    for c in xrange(n_source):
        thetas['theta' + str(c)] = pm.Dirichlet(
            'theta' + str(c),
            a=(a / dim) * np.ones((M[c], dim)), shape=(M[c], dim))
    for c in xrange(n_source):
        pm.Normal('y' + str(c),
                  mu=tt.dot(x, thetas['theta' + str(c)].T),
                  sd=pm.math.sqrt(sigma[c]) * np.ones((N, M[c])),
                  observed=y[c])
```

I can use ADVI through the old interface:

```python
with model:
    (means, std, elbos) = pm.variational.advi(n=10000)
```

This works fine for me. However, when I use the new API:

```python
approx = inference.fit(n=10000)
```

It takes a very long time before the iterations start.

Here you can see that inference.fit() takes much longer (about 40 minutes) before the iterations start.
PS: my OS is Ubuntu 16.04, and I use Anaconda Python 2.7 with CUDA 8.0.

The two for loops are likely inefficient. You should try to rewrite them.
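One generic way to reduce the number of graph nodes (a plain-NumPy sketch, not the actual model; names and shapes here are made up) is to replace the per-source loop with one concatenated matrix product and split the result afterwards:

```python
import numpy as np

rng = np.random.RandomState(0)
n_source, N, dim = 3, 5, 4
x = rng.randn(N, dim)
thetas = [rng.rand(6 + c, dim) for c in range(n_source)]  # ragged M[c] per source

# looped version: one dot product (one graph node) per source
looped = [x.dot(t.T) for t in thetas]

# blocked version: a single concatenated dot, then split back per source
block = x.dot(np.concatenate(thetas, axis=0).T)
splits = np.cumsum([t.shape[0] for t in thetas])[:-1]
blocked = np.split(block, splits, axis=1)

for a, b in zip(looped, blocked):
    assert np.allclose(a, b)
```

The same idea applies to Theano tensors: one big `dot` compiles to far fewer ops than `n_source` small ones.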

There might be ways to check which part of the initialization is slow using some profiling, @ferrine?
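Outside of IPython's `%%prun`, the standard library's `cProfile` gives the same breakdown; here is a generic sketch (the profiled function is just a stand-in for the expensive model-building step):

```python
import cProfile
import io
import pstats

def build_model():
    # stand-in for the expensive model / inference construction step
    return sum(i * i for i in range(200000))

pr = cProfile.Profile()
pr.enable()
build_model()
pr.disable()

# sort by cumulative time and keep the top 10 entries
buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats('cumulative').print_stats(10)
report = buf.getvalue()
```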

Also, regarding your original goal (i.e., checking the convergence of the latent parameters every N steps), you can write your own callback function.
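As a rough illustration of the callback pattern (a plain-Python stand-in for the training loop, not PyMC3 internals; the new API similarly passes the approximation, the loss history, and the iteration index to each callback):

```python
def fit(n_iter, callbacks=()):
    """Toy stand-in for inference.fit(): a loop that fires callbacks.
    (Hypothetical sketch -- the real loop updates the variational approximation.)"""
    losses = []
    state = {'mean': 0.0}  # pretend latent-parameter mean
    for i in range(n_iter):
        state['mean'] += 0.001           # pretend parameter update
        losses.append(1.0 / (i + 1.0))   # pretend ELBO-like loss
        for cb in callbacks:
            cb(state, losses, i)
    return state

# record the current mean every N steps, as the original question asks
N = 1000
history = []

def track_means(state, losses, i):
    if i % N == 0:
        history.append(state['mean'])

fit(3000, callbacks=[track_means])
print(len(history))  # prints 3 (snapshots at i = 0, 1000, 2000)
```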

Alternatively, you can copy the old variational API and change the code in L181-L201 to check the convergence internally (the old API will be removed soon).

I suppose Theano does graph optimizations, and that takes a lot of time.

Thanks for your advice. The parameter `n_source` is usually a small number, e.g. 2 or 3.
I profiled those two lines using `%%prun` in IPython.
The results are as follows:

```python
%%prun
with model:
    (means, std, elbos) = pm.variational.advi(n=10000)
```

The output file looks like this:

New API:

```python
approx = inference.fit(n=10000)
```

The output is:

It seems `pickle.py` and `{repr}` take the majority of the running time.

Try doing `thetas.append(...)` with a list and indexing it as `thetas[c]`, instead of using `thetas['theta' + str(c)]`.
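In plain Python terms the suggested change is just this (dummy strings stand in for the PyMC3 random variables):

```python
n_source = 3

# original pattern: dict keyed by generated string names
thetas_dict = {}
for c in range(n_source):
    thetas_dict['theta' + str(c)] = 'rv%d' % c  # stand-in for pm.Dirichlet(...)

# suggested pattern: a plain list, appended to and indexed by position
thetas_list = []
for c in range(n_source):
    thetas_list.append('rv%d' % c)  # stand-in for pm.Dirichlet(...)

assert thetas_list[1] == thetas_dict['theta1']
```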

I tried, but it still doesn't work. I modified the old variational API as you suggested. Thanks!