Reputation: 173
I am using defaultdicts to store lists of values where the keys
are periods for which values could be observed.
When looking up from a list of all periods of interest, I would like to find the closest period in my defaultdict (NB: not all periods are stored in the defaultdict).
As defaultdicts are not sorted however, the below approach does not return the correct value.
Is there a different way of returning the closest available key for defaultdicts?
from collections import defaultdict
import numpy as np
def_dict = defaultdict(list)
# entries that will be stored in the defaultdict
reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]}
# store items from regular dict in defaultdict
for k, v in reg_dict.items():
def_dict[k] = v
# Lookup periods
periods = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8]
for period in periods:
# this approach does not return the right keys as defaultdicts are not sorted
closest_key = np.abs(np.array(list(def_dict.keys())) - period).argmin()
print("period: ", period, " - looked up key: ", closest_key)
This returns the following:
period: -1 - looked up key: 0
period: 0 - looked up key: 0
period: 1 - looked up key: 0
period: 2 - looked up key: 1
period: 3 - looked up key: 1
period: 4 - looked up key: 2
period: 5 - looked up key: 2
period: 6 - looked up key: 2
period: 7 - looked up key: 2
period: 8 - looked up key: 2
Upvotes: 0
Views: 138
Reputation: 55489
As Eric said, to do this efficiently you should be using a binary search. However, if the number of keys is small the simple linear search may be adequate. There's no need to use a defaultdict or an OrderedDict, just sort the keys.
import numpy as np
# entries
reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]}
keys = np.array(sorted(reg_dict.keys()))
print('keys', keys)
# Lookup periods
periods = np.arange(-1, 9)
for period in periods:
closest_key = keys[np.abs(keys - period).argmin()]
print("period: ", period, " - looked up key: ", closest_key)
output
keys [-3 0 2 5]
period: -1 - looked up key: 0
period: 0 - looked up key: 0
period: 1 - looked up key: 0
period: 2 - looked up key: 2
period: 3 - looked up key: 2
period: 4 - looked up key: 5
period: 5 - looked up key: 5
period: 6 - looked up key: 5
period: 7 - looked up key: 5
period: 8 - looked up key: 5
Upvotes: 1
Reputation: 54263
With an OrderedDict
and sorted keys, you could use a binary search.
For large amount of keys, lookup would be much faster than your current method.
Since you want the closest key, you'd need to find both the rightmost key lower than x and the leftmost key higher than x. After finding the index i
for the rightmost key lower than x, the other candidate (leftmost key higher than x) will be on index i+1
.
You need to make sure that those indices are still in your array.
Finally, you just need to compute the distance to x from those 2 values.
Here's the doc for bisect
and np.searchsorted
Upvotes: 2
Reputation: 1006
I agree with the @septra that you need euqlidean distance, but that's achievable with numpy as well:
from collections import defaultdict
import numpy as np
def_dict = defaultdict(list)
# entries that will be stored in the defaultdict
reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]}
# store items from regular dict in defaultdict
for k, v in reg_dict.items():
def_dict[k] = v
periods = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8]
a = list(def_dict.keys())
for period in periods:
closest_key = np.sqrt(np.power(np.add(a, -period),2)).argmin()
# OR closest_key = np.abs(np.add(a, -period)).argmin()
print("period: ", period, " - looked up key: ", a[closest_key])
Upvotes: 1
Reputation: 1753
The way I understand, you want an output similar to this?
[0, 0, 0, 2, 2, 5, 5, 5, 5, 5]
For the above, the logic would be
closest_key = [min(def_dict.keys(), key = lambda x: abs(x - p)) for p in periods]
Specifying the optional key
parameter to built in python functions is useful in situations like these.
Upvotes: 1