find_MAP giving all zeros for sparse data


I’m trying to re-implement probabilistic matrix factorization (e.g. as described in section 2 of this paper). Below is the gist of my code. tfidf is a (somewhat) large scipy.sparse.csr.csr_matrix, with entries that are positive and no larger than 1.0. Notice that I’m jumping through hoops with my definition of R_nonzero to keep the likelihood from taking all the zeros into account (i.e., this is how I implement the I_{ij} indicator described in the paper).
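To make the masking concrete, here is a tiny self-contained example of what scipy.sparse.find returns (a toy matrix, not my real tfidf):

```python
import numpy as np
import scipy.sparse

# Toy sparse matrix standing in for tfidf: zeros are "unobserved".
R = scipy.sparse.csr_matrix(np.array([[0.5, 0.0, 0.0],
                                      [0.0, 0.9, 0.0]]))
rows, columns, entries = scipy.sparse.find(R)

# Only the nonzero (i, j) pairs survive, which is exactly the I_{ij}
# indicator: a likelihood built from rows/columns/entries never
# touches the zero cells.
print(sorted(zip(rows, columns, entries)))
```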

import numpy as np
import scipy.sparse
import pymc3 as pm
import theano.tensor as tt

rows, columns, entries = scipy.sparse.find(tfidf)

n, m = tfidf.shape
dim = 20
sigma = 0.15
sigma_u = 0.02
sigma_v = 0.02

with pm.Model() as pmf:
    U = pm.Normal('U', mu=0, sd=sigma_u, shape=[n, dim])
    V = pm.Normal('V', mu=0, sd=sigma_v, shape=[m, dim])
    R_nonzero = pm.Normal('R_nonzero',
                          mu=tt.sum(U[rows, :] * V[columns, :], axis=1),
                          sd=sigma,
                          observed=entries)
    map_estimate = pm.find_MAP()

The problem is that map_estimate now comes up with U and V as entirely zero matrices! find_MAP also converges in only 2 iterations, which I find a bit suspicious… is it possible that find_MAP is somehow stopping too early?

Thanks for your time!


Hm, actually, the same paper mentions that probabilistic matrix factorization usually performs poorly on sparse or imbalanced data, and that I should probably try Bayesian matrix factorization instead. So if there isn’t an obvious bug/suggestion out there, it might just be a bad model :smile:

I think there is a problem with the default optimizer - if I remember correctly, the original notebook chose a different optimizer from scipy.

The original notebook set method='L-BFGS-B', which is the default (weird, but alright).

I tried it again with method='Powell', and it returns slightly better results - V is no longer entirely zero, but U still is. The Powell method also does not use any gradient information, so that seems suspicious…


I think MAP always takes less time. Secondly, if U and V are coming out as entirely zero matrices, maybe try setting a different prior such as pm.HalfNormal or pm.Lognormal. However, I don’t have much idea about probabilistic matrix factorisation, so these are just guesses which you can try. :thinking:

Speaking of that, I think setting a new initial value might also help. But in general, MAP is not very reliable for complex problems.
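To see why the starting point matters here: for the standard PMF objective (squared reconstruction error plus Gaussian priors), U = V = 0 is a stationary point, so a gradient-based optimizer that starts there has nowhere to go. A quick numpy sketch, assuming that objective and using toy values:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, dim = 4, 3, 2
R = rng.random((n, m))            # toy "ratings", all observed for simplicity
sigma, sigma_u = 0.15, 0.02

def grad_U(U, V):
    # d/dU of the PMF negative log posterior:
    # ||R - U V^T||^2 / (2 sigma^2) + ||U||^2 / (2 sigma_u^2)
    return -(R - U @ V.T) @ V / sigma**2 + U / sigma_u**2

U0 = np.zeros((n, dim))
V0 = np.zeros((m, dim))
print(np.abs(grad_U(U0, V0)).max())  # 0.0: the gradient vanishes at the origin
```

That would explain the suspicious 2-iteration "convergence": the optimizer starts at a stationary point and immediately declares victory.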

Do you mean PyMC3’s implementation of find_MAP, or just MAP methods in general?

In general, I am always skeptical of using one point in a complex space to represent the whole space.
