Nonnegative matrix factorization in Sklearn

Question

I am applying nonnegative matrix factorization (NMF) to a large matrix. Essentially, NMF does the following: given an m by n matrix A, it decomposes A into A = WH, where W is m by d and H is d by n. The ProjectedGradientNMF method is implemented in the Python package Sklearn. I want the algorithm to return both W and H, but it seems to return only H, not W. Applying the algorithm again to A.T (the transpose) could give me W. However, I want to avoid computing it twice since the matrix is very large.

If you could tell me how to simultaneously get W and H, that would be great! Below is my code:

from sklearn.decomposition import ProjectedGradientNMF
import numpy
A = numpy.random.uniform(size = [40, 30])
nmf_model = ProjectedGradientNMF(n_components = 5, init='random', random_state=0)
nmf_model.fit(A)
H = nmf_model.components_.T

user3834473 · Accepted Answer

Luckily you can look through the source code:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/nmf.py

At the time of writing, fit_transform() starts at line 460, and line 530 shows that H gets attached to components_ while W is returned from the function.

So you shouldn't have to run this twice; just change:

nmf_model.fit(A)
H = nmf_model.components_.T

to

W = nmf_model.fit_transform(A)
H = nmf_model.components_
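
For what it's worth, ProjectedGradientNMF was deprecated and later removed from scikit-learn. In newer versions the NMF class follows the same pattern: fit_transform() returns W, and H is stored in components_. Below is a minimal sketch, assuming a recent scikit-learn where NMF is available. The max_iter value and the variable names are illustrative choices, not something taken from the question.

```python
import numpy as np
from sklearn.decomposition import NMF

# Same toy setup as in the question: a random nonnegative 40 x 30 matrix.
A = np.random.RandomState(0).uniform(size=(40, 30))

# max_iter=500 is an arbitrary choice to give the solver room to converge.
model = NMF(n_components=5, init="random", random_state=0, max_iter=500)

# A single fit yields both factors: W is returned, H is stored on the model.
W = model.fit_transform(A)   # shape (m, d) = (40, 5)
H = model.components_        # shape (d, n) = (5, 30)

print(W.shape, H.shape)
# Frobenius norm of the residual, as a rough check of the approximation.
print(np.linalg.norm(A - W @ H))
```

Note that components_ already has shape d by n, so no transpose is needed for A ≈ WH to hold.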
