Mahout text mining - most important words for a given singular value

Question

Question: Is there an easy way to see the most important words associated with each singular value?

Background: I have applied Mahout’s singular value decomposition tool to a collection of news articles. The articles come from two topics: 1) sports, and 2) business. I would like to see the most important words associated with each singular value. For example, for one singular value I might expect the most prominent words to be sports terms: score, team, player, coach. For another singular value I might expect to see business terms: company, profit, revenue.

My Approach: I am considering writing one file per singular value, with the words in each file sorted in descending order of importance for that singular value. This is just an idea, and I'm open to suggestions.

Below is the command I have used so far to compute the singular values with Mahout:

/mahout-distribution-0.7/bin/mahout svd \
  -i /vectors/tfidf-vectors/ \
  -o /svd-values/ \
  --numRows 100 \
  --numCols 591 \
  -r 100
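
To make the ranking described under "My Approach" concrete, here is a minimal Python sketch. It assumes the singular vectors have already been exported from Mahout as plain lists of floats (one weight per dictionary term) and that the term dictionary is available as a list indexed by column. The function name `top_terms`, the toy dictionary, and the toy vectors are all hypothetical and are not Mahout output or part of Mahout's API.

```python
def top_terms(singular_vector, dictionary, k=5):
    """Return the k (term, weight) pairs with the largest absolute weight."""
    # Pair each column index with its weight, then sort by magnitude, largest first.
    ranked = sorted(enumerate(singular_vector),
                    key=lambda index_weight: abs(index_weight[1]),
                    reverse=True)
    return [(dictionary[i], w) for i, w in ranked[:k]]


# Hypothetical toy data: 6 terms and 2 singular vectors (assumed, not real output).
dictionary = ["score", "team", "player", "company", "profit", "revenue"]
singular_vectors = [
    [0.60, 0.55, 0.50, 0.05, 0.02, 0.01],  # loads mostly on sports terms
    [0.03, 0.02, 0.04, 0.62, 0.58, 0.52],  # loads mostly on business terms
]

for n, vector in enumerate(singular_vectors):
    terms = [term for term, _ in top_terms(vector, dictionary, k=3)]
    print(f"singular vector {n}: {', '.join(terms)}")
```

The sketch sorts by absolute value because the sign of a singular vector is arbitrary, so a large negative weight marks a term as just as prominent as a large positive one. Writing each ranked list to its own file would give the one-file-per-singular-value layout described above.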

