user3440123

Reputation: 35

How to find out if there are any duplicates in list of lists

So I'm taking an intro computer science course right now, and I was wondering how to check whether there are any duplicates across multiple lists. I've read the answers to these questions:

"How can I compare two lists in python and return matches" and "How to find common elements in list of lists?"

However, they're not quite what I'm looking for. Say for example I have this list of lists:

list_x = [[66,76], 
          [25,26,27], 
          [65,66,67,68], 
          [40,41,42,43,44], 
          [11,21,31,41,51,61]]

There are two duplicated values (66 and 41), although which values they are doesn't really matter to me. Is there a way to find out whether duplicates exist? What I'm looking for is a function that returns True if there are duplicates (or False, depending on what I want to do with the lists). I get the impression that I should use sets (which we have not learned about, so I looked them up on the internet), use for loops, or write my own function. If it turns out that I need to write my own function, please let me know, and I'll edit with an attempt later today!

Upvotes: 1

Views: 318

Answers (3)

user2555451

A very simple solution is to flatten the list with a list comprehension and then use set and len together to test for duplicates:

>>> list_x = [[66,76],
...           [25,26,27],
...           [65,66,67,68],
...           [40,41,42,43,44],
...           [11,21,31,41,51,61]]
>>> flat = [y for x in list_x for y in x]
>>> flat # Just to demonstrate
[66, 76, 25, 26, 27, 65, 66, 67, 68, 40, 41, 42, 43, 44, 11, 21, 31, 41, 51, 61]
>>> len(flat) != len(set(flat)) # True because there are duplicates
True
>>>
>>> # This list has no duplicates...
... list_x = [[1, 2],
...           [3, 4, 5],
...           [6, 7, 8, 9],
...           [10, 11, 12, 13],
...           [14, 15, 16, 17, 18]]
>>> flat = [y for x in list_x for y in x]
>>> len(flat) != len(set(flat)) # ...so this is False
False
>>>

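Be warned, however, that this approach always builds the complete flattened list and a complete set before comparing their lengths, so it does unnecessary work when list_x is large and a duplicate appears early. If performance is a concern, you can use a lazy approach that combines a generator expression, any, and set.add; any stops at the first duplicate it finds: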

>>> list_x = [[66,76],
...           [25,26,27],
...           [65,66,67,68],
...           [40,41,42,43,44],
...           [11,21,31,41,51,61]]
>>> seen = set()
>>> any(y in seen or seen.add(y) for x in list_x for y in x)
True
>>>
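
Since the question asks for a function that returns True or False, the same early-exit idea can also be written as an explicit loop, which may be easier to follow in an intro course. This is only a sketch: the name has_duplicates and its parameter name are made up for illustration and do not come from the answer above.

```python
def has_duplicates(list_of_lists):
    """Return True as soon as any value is seen a second time.

    Hypothetical helper: the name and signature are illustrative only.
    """
    seen = set()
    for sublist in list_of_lists:
        for value in sublist:
            if value in seen:
                return True  # stop at the first repeat, like any() does
            seen.add(value)
    return False  # every value was new


list_x = [[66, 76],
          [25, 26, 27],
          [65, 66, 67, 68],
          [40, 41, 42, 43, 44],
          [11, 21, 31, 41, 51, 61]]

print(has_duplicates(list_x))                # True
print(has_duplicates([[1, 2], [3, 4, 5]]))   # False
```

The values must be hashable (numbers, strings, tuples) because they are stored in a set.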

Upvotes: 3

m.wasowski

Reputation: 6386

Here is a more straightforward solution with sets:

list_x = [[66,76], 
          [25,26,27], 
          [65,66,67,68], 
          [40,41,42,43,44], 
          [11,21,31,41,51,61]]
seen = set()
duplicated = set()
for lst in list_x:
    numbers = set(lst) # only unique
    # make intersection with seen and add to duplicated:
    duplicated |= numbers & seen 
    # add numbers to seen
    seen |= numbers

print(duplicated)  # the values that appear in more than one sublist

For information about set and its operations, see the docs: https://docs.python.org/2/library/stdtypes.html#set
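
To get the True/False answer the question asks for, the loop can be wrapped in a function and the resulting set converted with bool. The sketch below assumes a made-up function name, cross_list_duplicates. It also shows one property of this approach worth knowing: because each sublist is turned into a set first, a value repeated only inside a single sublist is not reported.

```python
def cross_list_duplicates(list_of_lists):
    """Return the set of values that occur in more than one sublist.

    Hypothetical helper: the name is illustrative only.
    """
    seen = set()
    duplicated = set()
    for lst in list_of_lists:
        numbers = set(lst)            # unique values of this sublist
        duplicated |= numbers & seen  # values already met in an earlier sublist
        seen |= numbers
    return duplicated


list_x = [[66, 76],
          [25, 26, 27],
          [65, 66, 67, 68],
          [40, 41, 42, 43, 44],
          [11, 21, 31, 41, 51, 61]]

duplicated = cross_list_duplicates(list_x)
print(sorted(duplicated))  # [41, 66]
print(bool(duplicated))    # True: an empty set would give False

# A repeat inside one sublist only is not counted by this approach.
print(bool(cross_list_duplicates([[1, 1], [2, 3]])))  # False
```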

Upvotes: 0

Martijn Pieters

Reputation: 1124558

Iterate and use a set to detect if there are duplicates:

seen = set()
dupes = [i for lst in list_x for i in lst if i in seen or seen.add(i)]

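This makes use of the fact that seen.add() returns None, which is falsy. A set is an unordered collection of unique values; the i in seen test is True if i is already part of the set. Otherwise, seen.add(i) records i and the condition as a whole is falsy, so only values that were seen before end up in the result.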

Demo:

>>> list_x = [[66,76], 
...           [25,26,27], 
...           [65,66,67,68], 
...           [40,41,42,43,44], 
...           [11,21,31,41,51,61]]
>>> seen = set()
>>> [i for lst in list_x for i in lst if i in seen or seen.add(i)]
[66, 41]
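
If only a True/False answer is needed, the list of duplicates can be passed to bool, since an empty list is falsy. The sketch below wraps the comprehension in a function with a made-up name, repeated_values, and also shows that a value occurring three times is listed twice, once for each repeat.

```python
def repeated_values(list_of_lists):
    """Return every repeat occurrence, in encounter order.

    Hypothetical helper: the name is illustrative only.
    """
    seen = set()
    # seen.add(i) returns None, so the condition is truthy only for repeats.
    return [i for lst in list_of_lists for i in lst if i in seen or seen.add(i)]


list_x = [[66, 76],
          [25, 26, 27],
          [65, 66, 67, 68],
          [40, 41, 42, 43, 44],
          [11, 21, 31, 41, 51, 61]]

print(repeated_values(list_x))                   # [66, 41]
print(bool(repeated_values(list_x)))             # True
print(repeated_values([[1, 2], [2, 3], [2]]))    # [2, 2]
print(bool(repeated_values([[1, 2], [3, 4]])))   # False
```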

Upvotes: 1
