Kevin

Reputation: 3239

Python - remove non-unique elements between lists

I have a dictionary of lists of image ids that belong to a class of images such as dog and cat. Some of the images contain both dogs and cats in the image, and I want to remove those images.

Let's say I have

{'cat':[1,2,3], 'dog':[2,3,4]}

We can see that the images with ids 2 and 3 contain both cats and dogs. I want to exclude these images to get the following:

[[1],[4]]

I have tried this so far:

from collections import Counter
img_ids = {'cat':[1,2,3], 'dog':[2,3,4]}
flattened = [item for sublist in img_ids.values() for item in sublist]
flattened_unique = [k for k, v in dict(Counter(flattened)).items() if v < 2]
filtered_ids_dfs = []
for key, val in img_ids.items():
  filtered = [x for x in val if x in flattened_unique]
  filtered_ids_dfs.append(filtered)
print(filtered_ids_dfs)

Is there a better or more elegant solution to this? Also, there may be an arbitrary number of classes, so the dictionary may have cat, dog, chicken, etc.

Upvotes: 2

Views: 285

Answers (3)

Alexander

Reputation: 109526

First, count how many objects (e.g. cat, dog) there are per image. Then find the images with only one object (unique images). Finally, use a dictionary comprehension to keep only the images that are in the set of unique images.

from collections import Counter

d = {'cat':[1,2,3], 'dog':[2,3,4], 'chicken': [2, 4, 5, 6]}

c = Counter([item for items in d.values() for item in items])
unique_images = set(k for k, count in c.items() if count == 1)  # .iteritems() in Python 2

>>> {k: [item for item in items if item in unique_images] for k, items in d.items()}  # .iteritems() in Python 2
{'cat': [1], 'dog': [], 'chicken': [5, 6]}

Upvotes: 5

nimish

Reputation: 4992

Just use sets:

d = {'cat':[1,2,3], 'dog':[2,3,4]}
common = set(d['cat']) & set(d['dog'])
out = [list(set(d['cat']) - common), list(set(d['dog']) - common)]

Extending this to more than two keys:

common = set.intersection(*(set(v) for k,v in d.items()))
out = [list(set(v) - common) for k,v in d.items()]
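Note that set.intersection only collects ids present in every class, so with three or more classes an id shared by just two of them would survive. If the goal is to drop any id that appears in more than one class, one possible set-based sketch is to subtract the union of all the other classes from each class. The three-class dictionary and the names sets and others are illustrative assumptions, and sorted() is used only to make the output order deterministic:

```python
d = {'cat': [1, 2, 3], 'dog': [2, 3, 4], 'chicken': [2, 4, 5, 6]}

# Convert each list to a set once so the set operations below are cheap.
sets = {k: set(v) for k, v in d.items()}

out = []
for k, s in sets.items():
    # Union of the ids from every class except the current one.
    others = set().union(*(t for j, t in sets.items() if j != k))
    # Keep only ids that belong to this class and to no other class.
    out.append(sorted(s - others))

print(out)  # [[1], [], [5, 6]]
```

For the original two-class input this gives [[1], [4]], the same as the intersection version.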

Upvotes: 5

Ajax1234

Reputation: 71451

You can use a list comprehension:

d = {'cat':[1,2,3], 'dog':[2,3,4]}
n = [[c for c in b if not any(c in h for j, h in d.items() if j != a)] for a, b in d.items()]

Output:

[[1], [4]]

Upvotes: 3
