Reputation: 200
I have a text file containing a sorted list of words being my dictionary.
I would like to use a TreeMap
in order to have log(n) as average cost when I have to see if a words belongs to the dictionary or not (that is containsKey
).
I have read of the Black-Read tree being behind the scenes of the TreeMap
, so it is self balancing.
My question is: which is the best way to feed the TreeMap
with the list of words?
I mean: feeding it with a sorted list should be the worst case scenario for a binary tree, because it have to balance almost every other word, haven't it?
The list of words can vary from 7K to 150K in number.
Upvotes: 2
Views: 528
Reputation: 12932
TreeMap
hides its implementation details, as good OO design prescribes, so to really optimize for your use case will probably be hard.
However, if it is an option to read all items into an array/list before adding them to your TreeMap
, you can add them "inside out": the middle element of the list will become the root, so add it first, and then recursively add the first half and second half in the same manner. In fact, this is the strategy that the TreeMap(SortedMap)
constructor follows.
If it is not an option to read all items, I think you have no other option than to simply put your entries to the map consecutively, or write your own tree implementation so that you have more control over how to generate it. If you at least know the number of items beforehand, you should be able to generate a balanced tree without ever having to rebalance.
If you do not need the extra features of a TreeMap
, you might also consider using a HashMap
, which (given a good hash function for your keys) even has O(1) access.
Upvotes: 1