OneMoreError
OneMoreError

Reputation: 7728

Searching through a dictionary quickly

I need to store a few hundred strings in a data structure. Every string has two fields associated with it, like say word meaning and its origin.I can store the words in any fashion, say sorted,reverse sorted or anyway you like.


I just need to search for a string within the dictionary as quickly as possible and fetch the two related fields. If possible, I want my search to be even better than the binary search.


I am using Java. Which data structure or Collection Class should I use ?


Note : I do not want to use database in this.

Upvotes: 2

Views: 2041

Answers (4)

amit
amit

Reputation: 178411

You can use a HashMap<String,MyDataObject> - it will be fastest and simplest to use.

Average seek time is O(|S|), where |S| is the string's length.

You can also try and use a trie or a radix tree, but make sure you want to give the time for it by profiling the HashMap solution before you start working on it.

Upvotes: 6

premprakash
premprakash

Reputation: 1595

Use HashTable or HashMap

your structure should look something like this HashMap<String,Bookcontent>

where BookContent is a class with attributes word meaning and origin

Upvotes: 1

Marko Topolnik
Marko Topolnik

Reputation: 200138

The obvious answer is "use a HashMap", but it is not without caveats. Every string you search for will need its hashcode calculated. If you use a new object every time, you pay O(s) every time (s being string length in this case), plus another O(s) for an equals check.

One way around this is to intern all the strings you use for searching. This will ensure the once-calculated hashcode is reused and will also short-circuit the ensuing equals check.

Another option is using a trie. Its advantage is that you pay at most O(s), but generally less—it is a prefix-based search, so a soon as you traverse up to the point where your prefix is unique, you get the result.

In conclusion, if you can arrange for the reuse of interned strings, a hashcode-based solution is optimal; if not, a trie is a superior choice.

Other commonplace options would be a skip list (used in Lucene) and B-tree (common in database indexes).

Upvotes: 2

Amarnath
Amarnath

Reputation: 8855

I suggest you to use Trie data structure. I have done an assignment something similar to this. This link helps you in implementing Trie DS.

Upvotes: 1

Related Questions