Using bisect efficiently with the keys of an OrderedDict

Question

I'm looking for the fastest way to do the following: given a dictionary and a key value, return the lowest key in the dictionary that is greater than the given value. Per this question, the natural way would seem to be to create an OrderedDict, then use bisect on the keys to find the proper key location. The OrderedDict.keys() method doesn't support indexing, so per e.g. this question, one has to convert the keys to a list before doing bisect or similar.

So once an OrderedDict has been created with its keys in order, in order to access the correct position one has to do the following:

1. Convert the keys to a list.
2. Do a binary search of the keys with bisect or similar.
3. Check that the insertion point isn't at the end of the list before retrieving the key located at that index.
4. Look up that key's value in the original OrderedDict.

I'm most concerned about step 1 above, from an efficiency perspective (although all of this looks roundabout to me). Without knowing the details of how Python does the conversion to list, it seems like it would have to be O(n), completely eliminating the savings of using OrderedDict and binary search. I'm hoping someone can tell me whether this assumption I have about step 1 is or isn't correct, and regardless whether or not there may be a better method.
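To make the concern concrete, here is a minimal sketch, assuming the OrderedDict was filled in ascending key order and is not modified during the batch. Converting the keys to a list copies every key, so it is O(n), but if that copy is made once and reused, each later lookup is only the O(log n) binary search plus a dict access. The helper names `build_sorted_keys` and `next_higher_value` are hypothetical and only for illustration.

```python
from bisect import bisect_right
from collections import OrderedDict


def build_sorted_keys(d):
    """Copy the keys into a list once: O(n).

    Assumes d was populated in ascending key order.
    """
    return list(d)


def next_higher_value(d, sorted_keys, sample_key, default=None):
    """Return the value for the lowest key strictly greater than sample_key."""
    i = bisect_right(sorted_keys, sample_key)  # O(log n) binary search
    if i == len(sorted_keys):
        return default  # no key is greater than sample_key
    return d[sorted_keys[i]]


d = OrderedDict([(5, 'lowest_value'), (7, 'middle_value'), (12, 'highest_value')])

sorted_keys = build_sorted_keys(d)  # pay the O(n) cost a single time
results = [next_higher_value(d, sorted_keys, k) for k in (6, 7, 12)]
```

The key list goes stale as soon as the dictionary changes, so this only fits a read-only batch; whether it is fast enough for millions of lookups would need to be measured.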

As an alternative, I could simply create a list of tuples, sorted by the first element (key), where the second element is the dict value associated with that key. Then I could pass the key lambda x:x[0] to bisect. This seems reasonable, but I'd prefer to store my key / value pairs more uniformly (e.g. JSON), since that's how it's done with other dicts in the same project that don't need this specific type of comparison.

Here's some example code for a single lookup. Edit: But lest anyone think I'm over-optimizing, the actual dictionary has ~3 million keys, and will be accessed ~7 million times in a batch, daily. So I'm very interested in finding a fast way of doing this.

# Single lookup example
from collections import OrderedDict
from bisect import bisect

d = OrderedDict()
d[5] = 'lowest_value'
d[7] = 'middle_value'
d[12] = 'highest_value'

sample_key = 6    # we want the value for the next key above this in d, i.e. d[7]

list_of_keys = list(d.keys())
key_insertion_index = bisect(list_of_keys, sample_key)

if key_insertion_index < len(list_of_keys):
    next_higher_key = list_of_keys[key_insertion_index]
    v = d[next_higher_key]
