Reputation: 113
So I have this tricky dictionary of tuples which I want to filter based on the first occurrence of the informative flag in the value elements. If the flag (which is the element occupying the first position of the tuple) is observed in other keys I will only retain only the first key-value pair in which it occurs and subsequent key-value pairs which contain the flag would be skipped.
old_dict = {'abc':[('abc', '1', '5'), ('def', '1', '5'), ('abcd', '2', '5')],
'def':[('abc', '2', '5'), ('def', '1', '5'), ('abcd', '1', '5')],
'ghi':[('ghi', '1', '5'), ('jkl', '1', '4'), ('mno', '2', '4')]}
I have struggled with a lot of attempts and this latest attempt does not produce anything meaningful.
flgset = set()
new_dict = {}
for elem, tp in old_dict.items():
for flg in tp:
flgset.add(flg[0])
counter = 0
for elem, tp in old_dict.items():
for (item1, item2, item3) in tp:
for flg in flgset:
if flg == item1:
counter = 1
new_dict[elem] = [(item1, item2, item3)]
break
Expected results should be:
new_dict = {'abc':[('abc', '1', '5'), ('def', '1', '5'), ('abcd', '2', '5')],
'ghi':[('ghi', '1', '5'), ('jkl', '1', '4'), ('mno', '2', '4')]}
Thanks in advance.
Upvotes: 0
Views: 532
Reputation: 1268
Others may have more efficient ways to do this, but here's one solution that incorporates your intuitions that you need to loop over old_dict
items and use a set:
for key, val in old_dict.items():
if val[0][0] not in set([v[0][0] for v in new_dict.values()]):
new_dict.update({key: val})
Here's a brief explanation of what's going on: First, val[0][0]
is the "informative flag" from your dictionary entry (i.e. the first item of the first tuple in the entry list). set([v[0][0] for v in new_dict.values()])
will give you the unique values of that flag in your new dictionary. The inner part is a list comprehension to get all the "flags" and then set
will give a unique list. The last line just uses the update
method to append to it.
REVISED ANSWER
@VinayPai raises two important issues below in the comments. First, this code is inefficient because it reconstructs the test set each time. Here's the more efficient way he suggests:
flag_list = set()
for key, val in old_dict.items():
if val[0][0] not in flag_list:
new_dict.update({key: val})
flag_list.add(val[0][0])
The second issue is that this will produce inconsistent results because dictionaries are not ordered. One possible solution is to use an OrderedDict. But as @SyntaxVoid suggests, this is only necessary if you're using Python3.5 or earlier (here is a great answer discussing the change). If you can create your data in this fashion, it would solve the problem:
from collections import OrderedDict
old_dict = OrderedDict{'abc':[('abc', '1', '5'), ('def', '1', '5'), ('abcd', '2', '5')],
'def':[('abc', '2', '5'), ('def', '1', '5'), ('abcd', '1', '5')],
'ghi':[('ghi', '1', '5'), ('jkl', '1', '4'), ('mno', '2', '4')]}
Upvotes: 1
Reputation: 1420
If i get you correctly, the following should do what you want:
flgset = set()
new_dict = {}
for k, tuple_list in old_dict.items():
# if the key is not in flgset, just keep the k, tuple_list pair
if k not in flgset:
new_dict[k] = tuple_list
# update the elements into flgset
# item in this case is ('abc', '2', '5'),
# since you only want to add the first element, use item[0]
for item in tuple_list:
flgset.add(item[0])
Output as such:
new_dict = {'abc': [('abc', '1', '5'), ('def', '1', '5'), ('abcd', '2', '5')],
'ghi': [('ghi', '1', '5'), ('jkl', '1', '4'), ('mno', '2', '4')]}
flgset = {'abc', 'abcd', 'def', 'ghi', 'jkl', 'mno'}
Upvotes: 2