Finger twist

Reputation: 3786

Returning unique elements from values in a dictionary

I have a dictionary like this:

d = {'v03':["elem_A","elem_B","elem_C"],'v02':["elem_A","elem_D","elem_C"],'v01':["elem_A","elem_E"]}

How would you return a new dictionary containing, for each key, only the elements that are not in the list stored under the highest key? In this case:

d2 = {'v02':['elem_D'],'v01':["elem_E"]}

Thank you.

Upvotes: 1

Views: 80

Answers (3)

J. Katzwinkel

Reputation: 1943

Depending on your Python version (2.7 or later), you may be able to do this in one line with a dict comprehension:

>>> d2 = {k: [v for v in values if v not in d.get(max(d.keys()))] for k, values in d.items()}
>>> d2
{'v01': ['elem_E'], 'v02': ['elem_D'], 'v03': []}

This builds a copy of dict d in which every list has been stripped of the items stored under the max key. The result is close to what you are after. If you don't want the empty list at key v03, filter the result with another dict comprehension:

>>> {k:v for k,v in d2.items() if len(v) > 0}
{'v01': ['elem_E'], 'v02': ['elem_D']}

EDIT: If your original dict has a very large key set, or the operation runs frequently, you may also want to replace the expression d.get(max(d.keys())) with a previously assigned list variable. The expression is not pre-computed; it is re-evaluated for every element tested. Substituting it roughly halves the running time: the following runs 100,000 times in 1.5 seconds on my machine, whereas the unsubstituted expression takes more than 3 seconds.

>>> bl = d.get(max(d.keys()))
>>> d2 = {k: v for k, v in {k: [v for v in values if v not in bl] for k, values in d.items()}.items() if len(v) > 0}
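
To check the timing claim on your own machine, a minimal sketch using the standard timeit module might look like the following. The function names unsubstituted and substituted are made up for this illustration, and absolute timings will differ from the figures quoted above.

```python
import timeit

d = {
    'v03': ["elem_A", "elem_B", "elem_C"],
    'v02': ["elem_A", "elem_D", "elem_C"],
    'v01': ["elem_A", "elem_E"],
}

def unsubstituted():
    # The max-key lookup is re-evaluated for every element tested.
    d2 = {k: [v for v in values if v not in d.get(max(d.keys()))]
          for k, values in d.items()}
    return {k: v for k, v in d2.items() if len(v) > 0}

def substituted():
    # The max-key list is looked up once and reused.
    bl = d.get(max(d.keys()))
    d2 = {k: [v for v in values if v not in bl] for k, values in d.items()}
    return {k: v for k, v in d2.items() if len(v) > 0}

# Both variants must agree before it makes sense to time them.
assert unsubstituted() == substituted()

# number=100000 mirrors the run count mentioned in the answer.
t_unsub = timeit.timeit(unsubstituted, number=100000)
t_sub = timeit.timeit(substituted, number=100000)
print("unsubstituted: %.2fs, substituted: %.2fs" % (t_unsub, t_sub))
```

The gap between the two should widen as the number of keys grows, since max(d.keys()) scans every key each time it is called.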

Upvotes: 0

PyNEwbie

Reputation: 4950

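from collections import defaultdict

myNewDict = defaultdict(list)
# sorted() returns a new sorted list of keys on both Python 2 and 3
all_keys = sorted(d)
max_value = all_keys[-1]  # the highest key, 'v03' for the sample data
for key in d:
    if key != max_value:
        for value in d[key]:
            # keep only values absent from the highest key's list
            if value not in d[max_value]:
                myNewDict[key].append(value)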

This gives:

defaultdict(<type 'list'>, {'v01': ['elem_E'], 'v02': ['elem_D']})

You can get fancier with set operations by taking the set difference between the values of each other key and the values in d[max_value], but first I think you should get comfortable working with dictionaries and lists.

One reason not to use sets is that the solution does not generalize well, because sets can only hold hashable objects. If your values are lists of lists, the members (sublists) are not hashable, so you can't use a set operation on them.
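
To illustrate the hashability point, here is a rough sketch that tries a set first and falls back to plain list membership when the items turn out to be unhashable. The helper name difference is hypothetical, and the nested sample data is invented for the example.

```python
def difference(values, reference):
    """Return the items of `values` not found in `reference`, in order."""
    try:
        ref = set(reference)  # fast lookups, but hashable items only
        return [v for v in values if v not in ref]
    except TypeError:
        # Unhashable items (e.g. sublists): fall back to scanning the list.
        return [v for v in values if v not in reference]

d = {
    'v03': ["elem_A", "elem_B", "elem_C"],
    'v02': ["elem_A", "elem_D", "elem_C"],
    'v01': ["elem_A", "elem_E"],
}

max_key = max(d)
d2 = {}
for key in d:
    if key != max_key:
        diff = difference(d[key], d[max_key])
        if diff:  # skip keys with nothing left over
            d2[key] = diff
print(d2)

# Invented nested data: the sublists cannot go into a set,
# so the list fallback handles them.
print(difference([["x"], ["y"]], [["x"]]))
```

The fallback is slower for long lists, since each membership test scans the whole reference list, but it works for any values that support equality comparison.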

Upvotes: 0

Aaron Hall

Reputation: 395913

I prefer to do differences with the built-in data type designed for it: sets.

It is also preferable to write loops rather than elaborate comprehensions. One-liners are clever, but code that you can return to and still understand is even better.

d = {'v03':["elem_A","elem_B","elem_C"],'v02':["elem_A","elem_D","elem_C"],'v01':["elem_A","elem_E"]}

last = None
d2 = {}
for key in sorted(d.keys()):
    if last is not None:
        # elements in the previous key's list that the current key's list lacks
        diff = set(d[last]) - set(d[key])
        if diff:
            d2[last] = sorted(diff)
    last = key

print(d2)
{'v01': ['elem_E'], 'v02': ['elem_D']}
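
Note that this loop compares each key with the next key in sorted order, which for the sample data gives the same result as comparing against the highest key. If every key should instead be compared against the highest key, as the question words it, a sketch along these lines may help. The function name not_in_highest and the d_alt data are made up for illustration.

```python
def not_in_highest(d):
    """Return, per key, the sorted elements missing from the highest key's list."""
    highest = max(d)
    top = set(d[highest])
    result = {}
    for key in sorted(d):
        if key == highest:
            continue
        diff = set(d[key]) - top
        if diff:  # leave out keys with no unique elements
            result[key] = sorted(diff)
    return result

d = {
    'v03': ["elem_A", "elem_B", "elem_C"],
    'v02': ["elem_A", "elem_D", "elem_C"],
    'v01': ["elem_A", "elem_E"],
}
print(not_in_highest(d))

# Invented data where the two readings differ: 'elem_B' is in 'v02'
# but not in the highest key 'v03', so 'v01' keeps it here.
d_alt = {'v03': ["elem_A"], 'v02': ["elem_A", "elem_B"], 'v01': ["elem_B"]}
print(not_in_highest(d_alt))
```

Like the loop above, this assumes the list elements are hashable and that sorted output is acceptable, since sets do not preserve the original order.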

Upvotes: 1
