I built a probabilistic model and I want to investigate the convergence properties of the latent parameters every N steps. The procedure can be summarized as follows:
import pymc3 as pm

means = None
for dummy_i in xrange(10):
    with my_model:
        means, std, elbos = pm.variational.advi(n=N, start=means)
    check_means(means)
I intend to track the model performance across iterations. However, I found that the result depends on the step length N. In my case, when N=5000, the model performance (AUC) fluctuates around 0.6; when N=10000, the performance approaches 1, and the final performance is around 0.9. It seems that stopping ADVI and restarting it from the previous means gives a different result.
So my question is: how can I stop the iteration and then continue from where it left off?
Thanks in advance.
Thanks for your detailed instructions @junpenglao. Very helpful! However, I still have some problems.
I defined my model with the following code:
import numpy as np
import theano
import pymc3 as pm

# y is a list of input matrices; a, b, and alpha are parameters.
y = input_mats
a = args.a
b = args.b
alpha = 2
N = y[0].shape[0]
n_source = len(y)
dim = args.k
M = [y[c].shape[1] for c in range(n_source)]
prior_mu = np.zeros([N, dim])
thetas = {}

with pm.Model() as model:
    sigma = pm.InverseGamma('sigma', alpha=alpha, shape=n_source)
    x = pm.Laplace('x', mu=prior_mu, b=b, shape=(N, dim))
    for c in xrange(n_source):
        thetas['theta' + str(c)] = pm.Dirichlet(
            'theta' + str(c),
            a=(a / dim) * np.ones((M[c], dim)), shape=(M[c], dim))
    for c in xrange(n_source):
        pm.Normal('y' + str(c),
                  mu=theano.tensor.dot(x, thetas['theta' + str(c)].T),
                  sd=pm.math.sqrt(sigma[c]) * np.ones((N, M[c])),
                  observed=y[c])
    inference = pm.ADVI()
I can use ADVI via the old interface:
with model:
    means, std, elbos = pm.variational.advi(n=10000)
This works fine for me. However, when I use the new API:
Here you can see that inference.fit() takes a much longer time (about 40 minutes) before the iterations even start.
PS: my OS is Ubuntu 16.04, and I use Anaconda Python 2.7 and CUDA 8.0.
The two for loops are likely inefficient. You should try to rewrite them.
There might be ways to check which part of the initialization is slow using some profiling, @ferrine?
Also, regarding your original goal (i.e., checking the convergence properties of the latent parameters every N steps), you can write your own callback function.
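A minimal sketch of such a callback, assuming the `(approx, losses, i)` signature that pymc3's variational callbacks use (the class name `TrackMeans` and its fields are my own, for illustration):

```python
# Convergence-tracking callback for inference.fit(); the (approx, losses, i)
# signature matches pymc3's variational callbacks.
class TrackMeans(object):
    def __init__(self, every=1000):
        self.every = every    # snapshot interval, in iterations
        self.history = []     # one copy of the posterior means per snapshot

    def __call__(self, approx, losses, i):
        # `i` is the current iteration number; snapshot every `every` steps
        if i % self.every == 0:
            self.history.append(approx.mean.eval().copy())

# Usage (inside the model context):
#     tracker = TrackMeans(every=1000)
#     approx = inference.fit(n=10000, callbacks=[tracker])
#     # tracker.history then holds the mean vector at each checkpoint
```

Comparing successive entries of `tracker.history` gives the per-N-steps convergence check from the original question without ever stopping and restarting the optimizer.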
Alternatively, you can copy the old variational API and change the code at L181-L201 to check convergence internally (note that the old API will be removed soon).
Thanks for your advice. The parameter n_source is often a small number, e.g. 2 or 3.
I profiled those two lines using %%prun in IPython.
The results are as follows:
%%prun
with model:
    means, std, elbos = pm.variational.advi(n=10000)