Reputation: 5579
I have a big matrix whose cells represent the number of occurrences of a word (row) in a text document (column).
counts = rand(567840,799); % 567840 words,799 text documents
Without executing a loop I need to:
1) extract the indices of the words that occur in at least 90% of the text documents
2) extract the indices of the words that occur at most 2 times in the whole collection of documents.
For point 2 I would do
idx_2 = find(sum(counts,2)<=2);
I am struggling with point 1... Can you help me?
Upvotes: 0
Views: 179
Reputation: 47402
For 1, take the fraction of documents in which each word has a nonzero count and compare it to 0.9:
idx1 = find(mean(counts > 0, 2) >= 0.9);
and for 2, as you said,
idx2 = find(sum(counts, 2) <= 2);
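To see what the two expressions do, here is a small sketch on made-up data (the 4-by-10 matrix and the names present and docFrac are only for illustration, they are not from the question). It assumes counts holds non-negative counts, so counts > 0 means "the word appears in that document".

```matlab
% Toy data: 4 words (rows) x 10 documents (columns); the values are made up.
counts = [1 2 1 1 3 1 1 1 2 1;    % appears in 10 of 10 documents
          1 0 1 1 1 1 1 1 1 1;    % appears in 9 of 10 documents
          1 0 0 1 1 1 1 1 1 1;    % appears in 8 of 10 documents
          0 0 0 0 2 0 0 0 0 0];   % appears twice in the whole collection

present = counts > 0;              % logical matrix: word present in document?
docFrac = mean(present, 2);        % per word: fraction of documents containing it
idx1 = find(docFrac >= 0.9);       % words present in at least 90% of documents
idx2 = find(sum(counts, 2) <= 2);  % words with a total count of at most 2
```

Here idx1 is [1; 2] (the third word is in only 80% of the documents) and idx2 is 4. Note that point 1 is about how many documents contain the word, while point 2 is about the total count, which is why one uses counts > 0 and the other uses counts directly.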
Edit: I see Luis Mendo already mentioned this in the comments, so I have marked this answer as community wiki.
Upvotes: 1