Reputation: 651
Let's say I have a very large dictionary like where the key is a tuple. Here is an example:
dic = {('A', 'B'): 1,
('A', 'C'): 2,
('B', 'C'): 3}
I am trying to filter the values of dict by only one element of the key. Here is what I am doing:
my_list = []
for k, v in dic.items():
if "A" in k:
my_list.append(v)
print(my_list)
[1, 2]
I am wondering is there a better way to perform this without looping through?
Upvotes: 1
Views: 1638
Reputation: 83
I would restructure your dictionary.
dic = {("A", "B"): 1, ("A", "C"): 2, ("B", "C"): 3}
This is pretty much the same as:
better_dictionary = {"A": [1,2], "B": [1,3], "C": [2,3]}
Here is some code:
dic = {("A", "B"): 1, ("A", "C"): 2, ("B", "C"): 3}
better_dictionary = {}
# (one time) init better dict: O(n) -- with n being the number of tuples
for k, v in dic.items():
for i in k:
if i in better_dictionary:
better_dictionary[i].append(v)
else:
better_dictionary[i] = [v]
my_list = better_dictionary["A"] # get values: O(1)
print(my_list)
# [1, 2]
Upvotes: 2
Reputation:
The list comprehension suggested by @Yevgeniy Kosmak is the most Pythonic way imo. However, if you don't want an explicit loop, then you could use filter
and zip
(which is pretty ugly imo):
list(zip(*filter(lambda x: 'A' in x[0], dic.items())))[1]
Here, with filter
, we create an iterable of 2-tuples where the first elements are keys with A
in it and the second elements are values of dic
.
Then we unpack this iterable and use zip
to create an iterable of tuples (we want the second element).
Output:
(1, 2)
If you need to do this multiple times, it's maybe a better idea to create another dictionary where keys are the first elements of the keys in dic
and values are list of values in dic
whose key's first elements match. That's suggested by @luku. Another way to do the same job is to use dict.setdefault
method to traverse dic
and append to values:
out = {}
for (k,_),v in dic.items():
out.setdefault(k,[]).append(v)
Then you can do:
>> print(out['A'])
[1, 2]
Upvotes: 1
Reputation: 3850
More pythonic for me:
dic = {('A', 'B'): 1,
('A', 'C'): 2,
('B', 'C'): 3}
dic = [el[1] for el in dic.items() if 'A' in el[0]]
print(dic)
# Returns [1, 2]
Or with tuple unpacking:
dic = [v for k, v in dic.items() if 'A' in k]
Upvotes: 2
Reputation: 37339
With the information that you're searching repeatedly, this can be improved. Basically, you want to avoid traversing the entire data structure repeatedly. As long as memory permits, you can precalculate either a map from elements of the keys to a set of the matching keys or just to the values themselves.
from collections import defaultdict
index = defaultdict(set)
for k in dic:
for element in k:
index[element].add(k)
Then something like:
my_list = [dic[k] for k in index["A"]]
This might still underperform if you're just doing a couple of searches, since building the index is more expensive than one single search. You'd want to profile to determine that.
There are variations possible on this - for instance you can maintain the index as a parallel structure to the dict itself, modifying it as you add and delete elements. That's more complex but means you distribute the cost of computing the index. And don't have to do it more than once unless something goes wrong. But this is one basic idea.
Upvotes: 1