Saket
Saket

Reputation: 46127

Best data structure for a given set of operations - Add, Retrieve Min/Max and Retrieve a specific object

I am looking for the optimal (time and space) optimal data structure for supporting the following operations:

  1. Add Persons (name, age) to a global data store of persons
  2. Fetch Person with minimum and maximum age
  3. Search for Person's age given the name

Here's what I could think of:

Now, although I keep a hash for better performance of (3), I think it may not be the best way if there are many collisions in the hash. Also, addition of a Person would mean an overhead of adding to the hash.

Is there anything that can be further optimized here?

Note: I am looking for the best (balanced) approach to support all these operations in minimum time and space.

Upvotes: 4

Views: 1327

Answers (3)

MAK
MAK

Reputation: 26586

It looks like that you need a data structure that needs fast inserts and that also supports fast queries on 2 different keys (name and age).

I would suggest keeping two data structures, one a sorted data structure (e.g. a balanced binary search tree) where the key is the age and the value is a pointer to the Person object, the other a hashtable where the key is the name and the value is a pointer to the Person object. Notice we don't keep two copies of the same object.

A balanced binary search tree would provide O(log(n)) inserts and max/min queries, while the hastable would give us O(1) (amortized) inserts and lookups.

When we add a new Person, we just add a pointer to it to both data structures. For a min/max age query, we can retrieve the Object by querying the BST. For a name query we can just query the hashtable.

Your question does not ask for updates/deletes, but those are also doable by suitably updating both data structures.

Upvotes: 1

NPE
NPE

Reputation: 500227

You can get rid of the array as it doesn't provide anything that the other two structures can't do.

Otherwise, a hashtable + min/max is likely to perform well for your use case. In fact, this is precisely what I would use.

As to getting rid of the hashtable because a poor hash function might lead to collisions: well, don't use a poor hash function. I bet that the default hash function for strings that's provided by your programming language of choice is going to do pretty well out of the box.

Upvotes: 4

I82Much
I82Much

Reputation: 27326

It sounds like you're expecting the name to be the unique idenitifer; otherwise your operation 3 is ambiguous (What is the correct return result if you have two entries for John Smith?)

Assuming that the uniqueness of a name is guaranteed, I would go with a plain hashtable keyed by names. Operation 1 and 3 are trivial to execute. Operation 2 could be done in O(N) time if you want to search through the data structure manually, or you can do like you suggest and keep track of the min/max and update it as you add/delete entries in the hash table.

Upvotes: 0

Related Questions