Reputation: 55

Remove some elements in one dictionary based on another

Suppose I have two dictionaries like this:

a  = {'file1': [('1.txt', 1.0),
                ('3.txt', 0.4),
                ('2.txt', 0.3)],
      'file2': [('1.txt', 0.5),
                ('2.txt', 0.2),
                ('3.txt', 1.0)]}

b =  {'file1': [('1.txt', 9),
                ('2.txt', 1),
                ('3.txt', 5),
                ('4.txt', 4)],
      'file2': [('1.txt', 0),
                ('2.txt', 2),
                ('3.txt', 3),
                ('4.txt', 0)]}

I want to filter dictionary b based on dictionary a, keeping only the entries of b whose file names also appear under the same key in a.

The expected result of the function is:

c =  {'file1': [('1.txt', 9),
                ('2.txt', 1),
                ('3.txt', 5)],
      'file2': [('1.txt', 0),
                ('2.txt', 2),
                ('3.txt', 3)]}

So far I've written a function, but its output isn't what I want.

def filter():
    c = {file1:set((txt1,value2)
               for file1,dic1 in a.items()
               for file2,dic2 in b.items()
               for txt1,value1 in dic1
               for txt2,value2 in dic2
               if txt1 == txt2 and file1 == file2)
         for file1,dic1 in a.items()}

    pp({k:v for k,v in c.items()})

The output now is shown below:

{'file1': {('1.txt', 0),
           ('1.txt', 9),
           ('2.txt', 1),
           ('2.txt', 2),
           ('3.txt', 3),
           ('3.txt', 5)},
 'file2': {('1.txt', 0),
           ('1.txt', 9),
           ('2.txt', 1),
           ('2.txt', 2),
           ('3.txt', 3),
           ('3.txt', 5)}}

I don't know what went wrong. Any help would be appreciated.

Upvotes: 2

Answers (3)

Padraic Cunningham

Reputation: 180391

If the two dicts may have keys that are not shared, and you only want to keep the keys common to both along with their matching values:

print({k:[v for v in val if v[0] in {x[0] for x in a[k]}] for k, val in b.items() if k in a})

{'file2': [('1.txt', 0), ('2.txt', 2), ('3.txt', 3)], 'file1': [('1.txt', 9), ('2.txt', 1), ('3.txt', 5)]}
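The comprehension above rebuilds the set of names from a[k] for every tuple in val. A minimal sketch that builds the set once per key, using the a and b from the question (the function name filter_by_names and its parameter names are hypothetical, chosen for illustration):

```python
a = {'file1': [('1.txt', 1.0), ('3.txt', 0.4), ('2.txt', 0.3)],
     'file2': [('1.txt', 0.5), ('2.txt', 0.2), ('3.txt', 1.0)]}

b = {'file1': [('1.txt', 9), ('2.txt', 1), ('3.txt', 5), ('4.txt', 4)],
     'file2': [('1.txt', 0), ('2.txt', 2), ('3.txt', 3), ('4.txt', 0)]}


def filter_by_names(data, reference):
    """Keep only the (name, value) pairs of data whose name appears
    under the same key in reference. Keys missing from reference are dropped."""
    result = {}
    for key, pairs in data.items():
        if key not in reference:
            continue
        # Build the lookup set once per key, not once per pair.
        allowed = {name for name, _ in reference[key]}
        result[key] = [pair for pair in pairs if pair[0] in allowed]
    return result


c = filter_by_names(b, a)
print(c)
```

The order of the tuples within each list follows b, since the lists are filtered rather than rebuilt from the set.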

If b has keys that are not in a and you also want to keep those keys with their values unchanged:

print({k:([v for v in val if v[0] in {x[0] for x in a[k]}] if k in a else val) for k, val in b.items()})

{'file2': [('1.txt', 0), ('2.txt', 2), ('3.txt', 3)], 'file1': [('1.txt', 9), ('2.txt', 1), ('3.txt', 5)]}
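With the data from the question both variants print the same result, because a and b share all their keys. A small sketch with made-up data (the 'file3' key and its contents are hypothetical) shows where the two differ:

```python
# Hypothetical data: 'file3' exists only in b.
a = {'file1': [('1.txt', 1.0)]}
b = {'file1': [('1.txt', 9), ('4.txt', 4)],
     'file3': [('7.txt', 2)]}

# Variant 1: keys missing from a are dropped entirely.
common_only = {k: [v for v in val if v[0] in {x[0] for x in a[k]}]
               for k, val in b.items() if k in a}

# Variant 2: keys missing from a are kept with their values untouched.
keep_extra = {k: ([v for v in val if v[0] in {x[0] for x in a[k]}]
                  if k in a else val)
              for k, val in b.items()}

print(common_only)
print(keep_extra)
```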

If you want to actually filter the original dict in place (this assumes every key of b is also in a, otherwise a[k] raises a KeyError):

for k, val in b.items():
    b[k] = [v for v in val if v[0] in {x[0] for x in a[k]}]

print(b)

Or a dict comp to create a new dict if all keys are common:

print({k:[v for v in val if v[0] in {x[0] for x in a[k]}]  for k, val in b.items()})

{'file2': [('1.txt', 0), ('2.txt', 2), ('3.txt', 3)], 'file1': [('1.txt', 9), ('2.txt', 1), ('3.txt', 5)]}

Filtering the original dict in place should be the most efficient option, since no new dict has to be built.
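As for why the comprehension in the question mixes the two files: the inner generator loops over a.items() a second time, which rebinds file1 inside the generator, so the test file1 == file2 no longer refers to the outer key and every key ends up with the matches from all files. A minimal sketch that keeps the question's comprehension style but looks up b[file1] directly (the use of b.get with an empty-list default is an assumption of this sketch, for keys missing from b):

```python
a = {'file1': [('1.txt', 1.0), ('3.txt', 0.4), ('2.txt', 0.3)],
     'file2': [('1.txt', 0.5), ('2.txt', 0.2), ('3.txt', 1.0)]}

b = {'file1': [('1.txt', 9), ('2.txt', 1), ('3.txt', 5), ('4.txt', 4)],
     'file2': [('1.txt', 0), ('2.txt', 2), ('3.txt', 3), ('4.txt', 0)]}

# For each key of a, keep the pairs from the same key of b
# whose file name also occurs in a's list for that key.
c = {file1: [(txt2, value2)
             for txt2, value2 in b.get(file1, [])
             if any(txt2 == txt1 for txt1, _ in dic1)]
     for file1, dic1 in a.items()}

print(c)
```

Using lists instead of set(...) also keeps the tuples in the order they have in b.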

Upvotes: 1

Pynchia

Reputation: 11590

I am a new learner as well, and my answer is not nearly as good as the one before, but since I spent some time solving the problem :) my code is:

from collections import defaultdict

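def f(data, flt):
    # Names allowed for each key, stored as a set for fast membership tests.
    newflt = {}
    for k, v in flt.items():
        newflt[k] = {t[0] for t in v}
    outd = defaultdict(list)
    for k, v in data.items():
        fv = newflt[k]
        # Keep only the tuples whose file name appears in the filter.
        for t in v:
            if t[0] in fv:
                outd[k].append(t)
    return outd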
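One thing to watch for when collecting the names for the membership test: in Python 3, map returns a one-shot iterator, so repeated `in` tests against it silently consume it and later lookups can fail. A small sketch of the pitfall next to a set-based lookup (the variable names here are illustrative only):

```python
pairs = [('1.txt', 1.0), ('3.txt', 0.4), ('2.txt', 0.3)]

# In Python 3 this is a lazy iterator, not a list.
names_iter = map(lambda t: t[0], pairs)
first = '3.txt' in names_iter    # found, but '1.txt' and '3.txt' are now consumed
second = '1.txt' in names_iter   # not found: '1.txt' was already consumed

# A set can be tested any number of times.
names_set = {t[0] for t in pairs}
third = '3.txt' in names_set
fourth = '1.txt' in names_set

print(first, second, third, fourth)
```

In Python 2, map returns a list, so the problem does not show up there.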

Upvotes: 1

Kasravnd

Reputation: 107287

You can use collections.defaultdict for such tasks:

>>> from collections import defaultdict
>>> d=defaultdict(list)
>>> for k,v in b.items():
...      for i in v:
...         if i[0] in zip(*a[k])[0]: #in python 3 next(zip(*a[k]))
...              d[k].append(i)
... 
>>> d
defaultdict(<type 'list'>, {'file2': [('1.txt', 0), ('2.txt', 2), ('3.txt', 3)], 'file1': [('1.txt', 9), ('2.txt', 1), ('3.txt', 5)]})

Note that to check whether the file names in b also appear in a, you can extract the file names from a with the zip function.

Alternatively, you can use the dict.setdefault() method:

>>> c={}
>>> for k,v in b.items():
...      for i in v:
...         if i[0] in zip(*a[k])[0]:
...            c.setdefault(k,[]).append(i)
... 
>>> c
{'file2': [('1.txt', 0), ('2.txt', 2), ('3.txt', 3)], 'file1': [('1.txt', 9), ('2.txt', 1), ('3.txt', 5)]}

Note: in Python 3, zip returns an iterator, which cannot be indexed, so you need to change zip(*a[k])[0] to next(zip(*a[k])).
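Putting the Python 3 note together, a minimal sketch of the defaultdict version for Python 3 might look like this. Two details are additions of this sketch rather than part of the code above: the empty-tuple default passed to next(), so that an empty list in a does not raise StopIteration, and computing the names once per key instead of once per item:

```python
from collections import defaultdict

a = {'file1': [('1.txt', 1.0), ('3.txt', 0.4), ('2.txt', 0.3)],
     'file2': [('1.txt', 0.5), ('2.txt', 0.2), ('3.txt', 1.0)]}

b = {'file1': [('1.txt', 9), ('2.txt', 1), ('3.txt', 5), ('4.txt', 4)],
     'file2': [('1.txt', 0), ('2.txt', 2), ('3.txt', 3), ('4.txt', 0)]}

d = defaultdict(list)
for k, v in b.items():
    # zip(*a[k]) transposes the pairs; its first item is the tuple of file names.
    names = next(zip(*a[k]), ())
    for i in v:
        if i[0] in names:
            d[k].append(i)

print(dict(d))
```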

Upvotes: 3
