sweeeeeet

Reputation: 1819

storing values with keys then searching the smartest way, ALGORITHMS

I've got a stream of more than 20 million values, each of which arrives with its corresponding key (there are more than 10 million keys). Each key is linked to one or more values (at most 50,000), for example:

... (key1, val1), (key2,val2), (key1, val3), (key2, val4), (key1, val6), (key3,val5)...

I store this stream as follows:

key1 : val1, val3, val6

key2 : val2, val4

key3 : val5

Each time I receive a new value in the stream, I first check if this value appears in the list of its corresponding key:

My question is: what is the most efficient data structure or tool for this process? I want to raise the flag as fast as possible. I thought of a hash table whose entries are linked lists (as in the example above), but scanning the whole linked list each time I add a value does not sound right. Recall that I do need this notion of the LAST value.

Thank you

Upvotes: 1

Views: 41

Answers (1)

Petar Ivanov

Reputation: 93020

Checking whether the new value is in the list is not optimal: it takes O(n) time, where n is the number of values already stored for that key.

You can use a hashtable instead, which makes the membership check O(1) on average. You can store the last value separately and update it on insert.

So you have a hashtable, where the values are pairs. Each pair consists of a hashtable (used as a set) and an element (the last element in the set).

Your example looks like this:

(key1 -> (val6, (val1->1, val3->1, val6->1)))
(key2 -> (val4, (val2->1, val4->1)))
(key3 -> (val5, (val5->1)))

You can optimize the case where the set contains only one element by not storing the last value explicitly.
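A minimal Python sketch of this layout, in case it helps. The names `KeyEntry`, `StreamIndex`, `observe` and `last` are made up for illustration. Two assumptions: the last value is updated on every observation, including repeats, and since the exact condition for your flag isn't shown, `observe` simply returns both whether the value was already seen and what the previous last value was, so the caller can decide.

```python
from typing import Dict, Hashable, Optional, Set, Tuple


class KeyEntry:
    """Per-key record: the set of values seen so far plus the most recent one."""

    __slots__ = ("values", "last")

    def __init__(self) -> None:
        self.values: Set[Hashable] = set()    # hash set: O(1) average lookup
        self.last: Optional[Hashable] = None  # most recently observed value


class StreamIndex:
    """Outer hash table mapping each key to its (last value, value set) pair."""

    def __init__(self) -> None:
        self._entries: Dict[Hashable, KeyEntry] = {}

    def observe(self, key: Hashable, value: Hashable) -> Tuple[bool, Optional[Hashable]]:
        """Record (key, value); return (seen_before, previous_last)."""
        entry = self._entries.get(key)
        if entry is None:
            entry = self._entries[key] = KeyEntry()
        seen_before = value in entry.values  # no scan of a list needed
        previous_last = entry.last
        entry.values.add(value)
        entry.last = value  # assumption: repeats also become the last value
        return seen_before, previous_last

    def last(self, key: Hashable) -> Optional[Hashable]:
        """Return the last value observed for key, or None if key is unknown."""
        entry = self._entries.get(key)
        return entry.last if entry is not None else None
```

Feeding it the stream from the question, `observe("key1", "val3")` afterwards reports that `val3` was already seen and that the previous last value for `key1` was `val6`. The single-element optimization mentioned above is left out to keep the sketch short.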

Upvotes: 2
