StopIteration in find_MAP()

I am trying to fit my data with probabilistic matrix factorization (PMF) using the PyMC3 library. I found the following code for doing so:

    with pm.Model() as pmf:
        pmf_U = pm.MvNormal('U', mu=0, tau=alpha_u * np.eye(dim), shape=(n, dim), testval=np.random.randn(n, dim) * .01)
        pmf_V = pm.MvNormal('V', mu=0, tau=alpha_v * np.eye(dim), shape=(m, dim), testval=np.random.randn(m, dim) * .01)
        pmf_R = pm.Normal('R', mu=theano.tensor.dot(pmf_U, pmf_V.T), tau=alpha, observed=train)

    start = pm.find_MAP(fmin=sp.optimize.fmin_powell)  # Find starting values by optimization
    step = pm.NUTS(scaling=start)
    trace = pm.sample(500, step)

find_MAP() starts and runs until it reaches 100%, and then I receive:

StopIteration exception

This seems to be a common issue, but I could not find a solution. Any idea what is wrong with the code, and how I can properly fit my data using PMF?

Can you try with start = pm.find_MAP('Powell')?

I am receiving the message:

WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.

Traceback (most recent call last):
  File "my_file.py", line 37, in <module>
    start = pm.find_MAP('Powell')
  File "Anaconda2\lib\site-packages\pymc3\tuning\starting.py", line 57, in find_MAP
    update_start_vals(start, model.test_point, model)
  File "Anaconda2\lib\site-packages\pymc3\util.py", line 148, in update_start_vals
    a.update({k: v for k, v in b.items() if k not in a})
AttributeError: 'str' object has no attribute 'update'

Could you please try updating to the newest release of PyMC3?

Should I do pip install pymc3 --upgrade? I tried reinstalling the library and got the same error.

That should usually do it. Can you import pymc3 and check the version? It is likely installed in another Python environment.
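A minimal sketch of that check, assuming Python 3.8 or newer (for importlib.metadata). The helper name describe_install is made up for illustration; it reports which interpreter is running and where a package would be imported from, without importing it:

```python
import sys
from importlib import metadata, util


def describe_install(package):
    """Report the running interpreter plus the location and version of a package.

    Hypothetical helper for illustration; it does not import the package.
    """
    spec = util.find_spec(package)  # None if the package is not importable here
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # not installed as a distribution in this environment
    return {
        "interpreter": sys.executable,
        "location": spec.origin if spec is not None else None,
        "version": version,
    }


print(describe_install("pymc3"))
```

If the reported interpreter is not the one that pip upgraded, the upgrade went into a different environment.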

I get some warnings about g++ and C++ (using pymc3.version …), and the version is 3.4.1.

Oops, it should be start = pm.find_MAP(method='Powell')
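The traceback is consistent with the string being bound to the first positional parameter, which find_MAP passes on as start to update_start_vals. A pure-Python sketch of that pitfall follows; find_map_like is a hypothetical stand-in written for illustration, not PyMC3's actual implementation:

```python
def find_map_like(start=None, method=None):
    """Hypothetical stand-in whose first positional parameter is `start`."""
    test_point = {"U": 0.0, "V": 0.0}  # made-up default starting values
    if start is None:
        start = {}
    # Same pattern as the failing line in the traceback: `start` must be a dict.
    start.update({k: v for k, v in test_point.items() if k not in start})
    return start, method


try:
    find_map_like("Powell")  # the string lands in `start`
except AttributeError as err:
    print("positional call fails:", err)

# Passing the optimizer name by keyword leaves `start` untouched.
print(find_map_like(method="Powell"))
```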


Cool, it seems to work! Thanks for the help. I am not sure, but it takes a good amount of time to run on my small matrix. Does that make sense?

No - Maybe try with the default option? start = pm.find_MAP()

I think this is what takes a lot of time:

step = pm.NUTS(scaling=start)

OK, I think I just need to review the documentation. Thanks again for the help!

Oh, that part is really not recommended now. If you are sampling, you should always go with the defaults first, e.g. trace = pm.sample(1000, tune=1000)

This line, trace = pm.sample(1000, tune=1000), takes around 1 h to train on the default example that I found from this post. On the other side, if I only run start = pm.find_MAP() and take the results using start["U"] and start["V"], the reconstructed matrix tends to be all zeros. Furthermore, does pm.sample() correspond to BMF?

find_MAP returns all zeros because the gradient around the initial value (all zeros) is very small, so the optimizer cannot converge to a satisfactory solution.
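To see why, here is a NumPy sketch of the gradient of the negative log posterior for a fully observed PMF model with isotropic Gaussian priors. The function name, the precision values, and the synthetic data are assumptions made for illustration and are not taken from the notebook:

```python
import numpy as np


def pmf_neg_log_post_grad(R, U, V, alpha=2.0, alpha_u=1.0, alpha_v=1.0):
    """Gradient of 0.5*alpha*||R - U V^T||^2 + 0.5*alpha_u*||U||^2 + 0.5*alpha_v*||V||^2."""
    resid = R - U @ V.T
    grad_U = -alpha * resid @ V + alpha_u * U
    grad_V = -alpha * resid.T @ U + alpha_v * V
    return grad_U, grad_V


rng = np.random.default_rng(0)
n, m, dim = 6, 5, 2
R = rng.normal(size=(n, m))  # synthetic ratings, for illustration only

# At U = V = 0 every term of the gradient vanishes, so an optimizer has no direction to move in.
gU0, gV0 = pmf_neg_log_post_grad(R, np.zeros((n, dim)), np.zeros((m, dim)))

# Close to zero the gradient is still small compared with a generic point.
gU_small, _ = pmf_neg_log_post_grad(
    R, 0.01 * rng.normal(size=(n, dim)), 0.01 * rng.normal(size=(m, dim))
)
gU_big, _ = pmf_neg_log_post_grad(R, rng.normal(size=(n, dim)), rng.normal(size=(m, dim)))

print("max |grad| at zero:", np.abs(gU0).max(), np.abs(gV0).max())
print("grad norm near zero:", np.linalg.norm(gU_small))
print("grad norm at a generic point:", np.linalg.norm(gU_big))
```

The all-zero point is a stationary point of this objective, which matches the all-zero reconstruction reported above.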

Both pm.sample and pm.find_MAP perform inference on the same BMF model.

I recently updated the PMF doc: http://docs.pymc.io/notebooks/probabilistic_matrix_factorization.html. It seems there are quite a few problems with sampling from these kinds of models due to label switching. The solution shown in the notebook is somewhat OK: it finds a local mode using ADVI initialization and then samples around that mode. The danger is that it silently ignores part of the parameter space, which makes some operations on the MCMC samples invalid. So, depending on your aim in using such a model, you might choose to do inference following a recipe similar to the one in this doc.
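A small NumPy illustration of the label-switching issue, using made-up factor matrices: permuting or sign-flipping the latent dimensions of U and V together leaves the product U V^T and the isotropic Gaussian prior term unchanged, so the posterior has many equivalent modes, and averaging draws from different modes does not give a valid factorization.

```python
import numpy as np

rng = np.random.default_rng(1)
U = rng.normal(size=(4, 3))  # made-up user factors
V = rng.normal(size=(5, 3))  # made-up item factors

# Relabel the latent dimensions and flip the sign of one of them, in both factors.
perm = [2, 0, 1]
signs = np.array([1.0, -1.0, 1.0])
U_alt = U[:, perm] * signs
V_alt = V[:, perm] * signs

same_product = np.allclose(U @ V.T, U_alt @ V_alt.T)
same_prior = np.isclose((U ** 2).sum(), (U_alt ** 2).sum())

# Averaging two equivalent modes mixes up the latent dimensions.
U_avg = 0.5 * (U + U_alt)
V_avg = 0.5 * (V + V_alt)
avg_matches = np.allclose(U @ V.T, U_avg @ V_avg.T)

print("same reconstruction:", same_product)
print("same prior term:", same_prior)
print("averaged factors reproduce it:", avg_matches)
```

Quantities that depend only on the product U V^T, such as predicted ratings, are not affected by this relabeling.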

I just want to remove some values from my data, replace them with NaN, and see whether I can reproduce them after fitting with the PMF and BMF algorithms. I have already done this efficiently using NMF, and I was curious to check the performance of those algorithms.
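A rough sketch of that hold-out evaluation in plain NumPy. It is not the PyMC3 model from this thread: the MAP fit is a simple gradient descent over the observed entries only, and the helper names, hyperparameters, and synthetic low-rank data are all assumptions made for illustration.

```python
import numpy as np


def holdout_mask(shape, frac, rng):
    """Boolean mask that is True where an entry is hidden from training."""
    return rng.random(shape) < frac


def fit_pmf_map(train, dim=2, lam=0.1, lr=0.005, steps=3000, seed=0):
    """Gradient-descent MAP fit that uses only the non-NaN entries of `train`."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(train)
    filled = np.where(observed, train, 0.0)
    # Small random start; an all-zero start would have a zero gradient.
    U = 0.1 * rng.normal(size=(train.shape[0], dim))
    V = 0.1 * rng.normal(size=(train.shape[1], dim))
    for _ in range(steps):
        resid = observed * (filled - U @ V.T)  # hidden entries contribute nothing
        U, V = U + lr * (resid @ V - lam * U), V + lr * (resid.T @ U - lam * V)
    return U, V


def rmse(truth, pred, mask):
    """Root-mean-square error restricted to the masked entries."""
    return float(np.sqrt(np.mean((truth[mask] - pred[mask]) ** 2)))


rng = np.random.default_rng(42)
full = rng.normal(size=(20, 2)) @ rng.normal(size=(15, 2)).T  # synthetic rank-2 data
hidden = holdout_mask(full.shape, 0.2, rng)
train = np.where(hidden, np.nan, full)  # replace held-out values with NaN

U, V = fit_pmf_map(train)
heldout_rmse = rmse(full, U @ V.T, hidden)
baseline_rmse = rmse(full, np.zeros_like(full), hidden)  # predict zero everywhere
print("held-out RMSE:", heldout_rmse, "zero baseline:", baseline_rmse)
```

The same mask and rmse helper can be reused to score reconstructions from NMF or from posterior samples, so the methods are compared on identical held-out entries.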