frazman
frazman

Reputation: 33273

Python: find closest key in a dictionary from the given input key

I have a data in form a dictionary.. NOw I take the input from the user and it can be anything.. And I am trying to do the following. If the key exists then cool.. fetch the value from the dictionary. if not, then fetch the nearest (in numeric sense). For example..if the input key is 200 and the keys are like :....

197,202,208...

Then probably 202 is the closest key to 200.. Now, from algorithm point of view. its straight forward.. but is there a pythonic way to do this? Thanks

Upvotes: 30

Views: 30189

Answers (6)

MaxNoe
MaxNoe

Reputation: 15007

Using sortedcontainers.SortedDict, you can do this like this:

def closest_item(sdict, key):
    if len(sdict) == 0:
        raise KeyError('No items in {sdict.__class__.__name__}')

    if len(sdict) == 1:
        return next(iter(sdict.items()))

    idx_before = next(sdict.irange(minimum=key), None)
    idx_after = next(sdict.irange(maximum=key, reverse=True), None)

    if idx_before is None:
        idx = idx_after

    elif idx_after is None:
        idx = idx_before
    else:
        idx = min(idx_before, idx_after, key=lambda x: abs(x - key))

    return idx, sdict[idx]

Upvotes: 0

GrantJ
GrantJ

Reputation: 8709

Rather than using OrderedDict and bisect, consider the SortedDict type in the sortedcontainers module. It's a pure-Python and fast-as-C implementation of sorted list, sorted dict, and sorted set types with 100% testing coverage and hours of stress.

With a SortedDict you can bisect for the desired key. For example:

from itertools import islice
from sortedcontainers import SortedDict

def closest(sorted_dict, key):
    "Return closest key in `sorted_dict` to given `key`."
    assert len(sorted_dict) > 0
    keys = list(islice(sorted_dict.irange(minimum=key), 1))
    keys.extend(islice(sorted_dict.irange(maximum=key, reverse=True), 1))
    return min(keys, key=lambda k: abs(key - k))

The closest function uses SortedDict.irange to create an iterator of keys nearest the given key. The keys are bisected with log(N) runtime complexity.

>>> sd = SortedDict({-3: 'a', 0: 'b', 2: 'c'})
>>> for num in range(-5, 5):
...     key = closest(sd, num)
...     print('Given', num, ', closest:', key)
Given -5 , closest: -3
Given -4 , closest: -3
Given -3 , closest: -3
Given -2 , closest: -3
Given -1 , closest: 0
Given 0 , closest: 0
Given 1 , closest: 2
Given 2 , closest: 2
Given 3 , closest: 2
Given 4 , closest: 2

It's Pythonic to use PyPI!

Upvotes: 19

Brian Larsen
Brian Larsen

Reputation: 1756

This issue is made a lot harder by dict keys being in no particular order. If you can play with how you make the dict so they are in order (like your example) and use python >= 2.7 you can use OrderedDict and bisect to make this lightning fast.

import collections
a = collections.OrderedDict()
for i in range(100):
    a[i] = i

import bisect
ind = bisect.bisect_left(a.keys(), 45.3)

Then you only have to check element ind and ind-1 to see which is closer, thus making a lot fewer calculations.


As pointed out below by Steven G, in Python3 the .keys() is not just a list and must be changed into one.

bisect.bisect_left(list(a.keys()), 45.3)

Upvotes: 39

Will
Will

Reputation: 4528

here's your function on one line:

data.get(num, data[min(data.keys(), key=lambda k: abs(k-num))])

edit: to not evaluate the min when the key is in the dict use:

data[num] if num in data else data[min(data.keys(), key=lambda k: abs(k-num))]

or if all values in data evaluate to True you can use:

data.get(num) or data[min(data.keys(), key=lambda k: abs(k-num))]

Upvotes: 34

comingstorm
comingstorm

Reputation: 26117

If all you have is a Python dictionary, you can't do better than checking all the entries in the dictionary (as in Will's answer). However, if you want to find the closest key more efficiently than that (i.e., in O(log N) instead of O(N)), you want a balanced tree of some sort.

Unfortunately, I don't believe Python has such a datastructure in its standard library -- as the Pythonic way is to use a dict instead. So, if you expect to make a many such queries on a large map, your best choice may be to find an extension library, or even roll your own...

Upvotes: 1

John Doe
John Doe

Reputation: 3506

This should do what you want (minus getting it from a key, but you can figure that out :).

f = lambda a,l:min(l,key=lambda x:abs(x-a))
numbers = (100, 200, 300, 400)
num = int(raw_input())
print 'closest match:', f(num, numbers)

Note: f is from this question.

Upvotes: 0

Related Questions