Reputation: 211
I want to store some words and their occurrence times in a website, and I don't know which structure I should use.
Every time I add a word in the structure, it first checks if the word already exists, if yes, the occurrence times plus one, if not, add the word into the structure. Thus I can find an element very fast by using this structure. I guess I should use a hashtable or hashmap, right?
And I also want to get a sorted list, thus the structure can be ranked in a short time.
Forgot to mention, I am using Java to write it.
Thanks guys! :)
Upvotes: 0
Views: 157
Reputation: 68715
Define a Hashmap with word as the key and counter as the value
Map<String,Integer> wordsCountMap = new HashMap<String,Integer>();
Then add the logic like this:
Upvotes: 1
Reputation: 2444
Any Map Implementation Will Do. If Localized Changes prefer HashMap otherWise ConcurrentHashMap for multithreading.
Remember to use any stemming Library. stemming library in java for example working and work logically are same word.
Remember Integer is immutable see example below Example :
Map<String, Integer> occurrence = new ConcurrentHashMap<String, Integer>();
synchronized void addWord(String word) { // may need to synchronize this method
String stemmedWord = stem(word);
Integer count = occurrence.get(stemmedWord)
if(count == null) {
count = new Integer(0);
}
count ++;
occurrence.put(stemmedWord, count);
**// the above is necessary as Integer is immutable**
}
Upvotes: 0
Reputation: 886
So, you could use HashMap, but don't forget about multythreading. Is this data structure could be accessed throught few thread? Also, you could use three map in a case that data have some hirarchy (e.g. in a case of rakning and sort it by time). Also, you could look throught google guava collections, probably, they will be more sutabile for you.
Upvotes: 0
Reputation: 14709
A HashMap seems like it would suit you well. If you need a thread-safe option, then go with ConcurrentHashMap.
For example:
Map<String, Integer> wordOccurenceMap = new HashMap<>();
"TreeMap
provides guaranteed O(log n) lookup time (and insertion etc), whereas HashMap
provides O(1) lookup time if the hash code disperses keys appropriately. Unless you need the entries to be sorted, I'd stick with HashMap
." -part of Jon Skeet's answer in TreeMap or HashMap.
Upvotes: 2
Reputation: 18868
TreeMap is the better solution, if you want both Sorting functionality and counting words. Custom Trie can make more efficient but it's not required unless you are modifying the words.
Upvotes: 1