Reputation: 431
I have a scenario like this, I need to store the count of the strings and I need to return the top ten strings which has the max count,
for example,
String Count
---------------------------------
String1 10
String2 9
String3 8
.
.
.
String10 1
I was thinking of using the hashtable for storing the string and its count but it would be difficult to retrieve the top ten strings from it as I have to again loop through it to find them.
Any other suggestions here?
Upvotes: 1
Views: 156
Reputation: 132
You need a Max Heap data structure for this. Put it all into the max heap, and make successive 10 (or whatever n) removals.
And if you intend to keep reusing the data once it's loaded into memory, it might be worth the expense of sorting by value instead of a heap.
Upvotes: 0
Reputation: 114757
Simply use a sorted map like
Map<Integer, List<String>> strings
where the key is the frequency value and the value the list of strings that occur with that frequency.
Then, loop through the map and, with an inner loop through the value lists until you've seen 10 strings. Those are the 10 most frequent one's.
With the additional requirement that the algorithm should support updates of frequencies: add the strings to a map like Map<String, Integer>
where the key is the string and the value the actual frequency (increment the value if you see a string again). After that copy the key/value pairs to the map I suggested above.
Upvotes: 3
Reputation: 11805
Guava has a HashMultiset which would be very useful for this.
HashMultiset<String> ms = Hashmultiset.create();
ms.add(astring);
ms.add(astring, times);
ImmutableMultiset<String> ims = Multisets.copyHighestCountFirst(ms);
// iterator through the first 10 elements, and they will be your top 10
// from highest to lowest.
Upvotes: 0
Reputation: 3170
I am not sure, but i guess that the most suitable and elegant class for your needs is guava's http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/TreeMultiset.html
Upvotes: 0
Reputation: 28482
For any task like "find N top items" priority queues are perfect solution. See Java's PriorityQueue class.
Upvotes: 0
Reputation: 906
Priority Que.
You can make a class to put in it:
public class StringHolder{
private String string;
private int value;
//Compare to and equals methods
}
then it will sort when you insert and it is easy to get the top 10.
Upvotes: 4