jtor

Reputation: 133

Check for duplicates in a Python list

I've seen a lot of variations of this question, from things as simple as removing duplicates to finding and listing duplicates. Even taking bits and pieces of those examples does not get me my result.

My question is: how can I check whether my list has a duplicate entry? Even better, does my list have a non-zero duplicate?

I've had a few ideas -

#empty list
myList = [None] * 9 

#all the elements in this list are None

#fill part of the list with some values
myList[0] = 1
myList[3] = 2
myList[4] = 2
myList[5] = 4
myList[7] = 3

#coming from C, I attempt to use a nested for loop
j = 0
k = 0
for j in range(len(myList)):
    for k in range(len(myList)):
        if myList[j] == myList[k]:
            print "found a duplicate!"
            return

If this worked, it would find the duplicate (None) in the list. Is there a way to ignore the None or 0 case? I do not care if two elements are 0.

Another solution I thought of was to turn the list into a set and compare the lengths of the set and the list to determine whether there is a duplicate, but running set(myList) not only removes duplicates, it reorders the elements as well. I could keep separate copies, but that seems redundant.

Upvotes: 2

Views: 24829

Answers (7)

CodedCuber

Reputation: 1

In my opinion, this is the simplest solution I could come up with. It should work with any list. The only downside is that it does not count the number of duplicates; it just returns True or False.

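from itertools import combinations
def has_duplicate(mylist):
    # True if any two elements of the list are equal
    return any(k == j for k, j in combinations(mylist, 2))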

Upvotes: -2

jpp

Reputation: 164613

You can use collections.defaultdict and specify a condition, such as non-zero / Truthy, and specify a threshold. If the count for a particular value exceeds the threshold, the function will return that value. If no such value exists, the function returns False.

from collections import defaultdict

def check_duplicates(it, condition, thresh):
    dd = defaultdict(int)
    for value in it:
        dd[value] += 1
        if condition(value) and dd[value] > thresh:
            return value
    return False

L = [1, None, None, 2, 2, 4, None, 3, None]

res = check_duplicates(L, condition=bool, thresh=1)  # 2

Note that in the above example, passing bool as the condition means 0 and None are never reported as threshold breaches. You could also use, for example, lambda x: x != 1 to exclude values equal to 1.
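As a rough illustration of the lambda variant mentioned above, here is the same function called with both conditions. The list `data` is made up for this example, and the function is repeated only so the snippet runs on its own.

```python
from collections import defaultdict

def check_duplicates(it, condition, thresh):
    dd = defaultdict(int)
    for value in it:
        dd[value] += 1
        if condition(value) and dd[value] > thresh:
            return value
    return False

data = [1, 1, 5, 5, 0, 0]  # hypothetical example data

# Ignore repeats of 1: the first value that qualifies is 5
skip_ones = check_duplicates(data, condition=lambda x: x != 1, thresh=1)

# Ignore falsy values only: the repeated 1 is found first
truthy_only = check_duplicates(data, condition=bool, thresh=1)
```

One caveat: with a condition that lets 0 through, a duplicated 0 would be returned as 0, which is itself falsy. In that case it is safer to test the result with `is not False` than with a plain `if`.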

Upvotes: 0

Padraic Cunningham

Reputation: 180391

To remove dups and keep order, ignoring 0 and None. If you have other falsey values that you want to keep, you will need to test explicitly for None and 0 instead of relying on truthiness:

print([ele for ind, ele in enumerate(lst) if ele not in lst[:ind] or not ele])

If you just want the first dup:

for ind, ele in enumerate(lst[:-1]):
    if ele in lst[ind+1:] and ele:
        print(ele)
        break

Or store seen in a set:

seen = set()
for ele in lst:
    if ele in seen:
        print(ele)
        break
    if ele:
        seen.add(ele) 
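If a return value is more convenient than a print, the set-based loop can be wrapped in a function. This is only a sketch: the name `first_truthy_duplicate` and the sample list are made up here, and it assumes every element is hashable.

```python
def first_truthy_duplicate(lst):
    """Return the first repeated truthy element, or None if there is none."""
    seen = set()
    for ele in lst:
        if ele in seen:
            return ele
        if ele:  # None, 0 and other falsy values are never recorded
            seen.add(ele)
    return None

my_list = [1, None, None, 2, 2, 4, None, 3, None]  # sample data
result = first_truthy_duplicate(my_list)
```

Because each membership test on a set takes roughly constant time, this makes a single pass over the list instead of rescanning a slice for every element.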

Upvotes: 1

Malik Brahimi

Reputation: 16711

If you simply want to check whether the list contains duplicates: once the function finds an element that occurs more than once, it returns 'Duplicate'.

my_list = [1, 2, 2, 3, 4]

def check_list(arg):
    for i in arg:
        if arg.count(i) > 1:
            return 'Duplicate'

print check_list(my_list) == 'Duplicate' # prints True
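The function above also reports repeated None or 0 entries. A possible variant that skips them and returns a plain boolean could look like the sketch below; the name `has_nonzero_duplicate` is invented for this example.

```python
def has_nonzero_duplicate(items):
    # any() stops at the first match. list.count() rescans the whole list
    # on every call, so this is quadratic in the worst case.
    return any(item and items.count(item) > 1 for item in items)

with_dupes = has_nonzero_duplicate([1, None, None, 2, 2])
only_falsy_dupes = has_nonzero_duplicate([0, 0, None, None, 1, 2])
```

For short lists the quadratic cost is unlikely to matter; for long ones a set or Counter based approach is probably a better fit.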

Upvotes: 3

rchang

Reputation: 5236

I'm not certain whether you are trying to ascertain that a duplicate exists, or to identify the items that are duplicated (if any). Here is a Counter-based solution for the latter:

# Python 2.7
from collections import Counter

#
# Rest of your code
#

counter = Counter(myList)
dupes = [key for (key, value) in counter.iteritems() if value > 1 and key]
print dupes

The Counter object will automatically count occurrences of each item in your iterable list. The list comprehension that builds dupes filters out all items that appear only once, as well as items whose boolean evaluation is False (this filters out both 0 and None).

If your purpose is only to identify that duplication has taken place (without enumerating which items were duplicated), you could use the same method and test dupes:

if dupes:  print "Something in the list is duplicated"
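The code above targets Python 2.7. On Python 3, `iteritems()` no longer exists and `print` is a function, so an equivalent sketch would be as follows (the sample list is made up to mirror the question's data):

```python
from collections import Counter

my_list = [1, None, None, 2, 2, 4, None, 3, None]  # sample data

counter = Counter(my_list)
# items() replaces the Python 2-only iteritems()
dupes = [key for key, count in counter.items() if count > 1 and key]
print(dupes)

if dupes:
    print("Something in the list is duplicated")
```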

Upvotes: 2

ericmjl

Reputation: 14684

Here's a bit of code that will show you how to remove None and 0 from the set.

l1 = [0, 1, 1, 2, 4, 7, None, None]

l2 = set(l1)
l2.remove(None)
l2.remove(0)
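One way to turn this into an actual duplicate check is to filter the list first and then compare lengths. This is a sketch under the assumption that every falsy value should be ignored, not only None and 0; the names `kept` and `has_duplicate` are invented for the example.

```python
l1 = [0, 1, 1, 2, 4, 7, None, None]

# Keep only the values we care about (drops None, 0 and other falsy values)
kept = [x for x in l1 if x]

# set() builds a new object, so l1 itself is left untouched
has_duplicate = len(set(kept)) != len(kept)
```

Note that `set.remove` raises KeyError when the value is not present, so `set.discard` is the safer call if the list might not contain None or 0.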

Upvotes: -2

paolo

Reputation: 2538

Try changing the actual comparison line to this (the j != k test keeps an element from matching itself):

if j != k and myList[j] == myList[k] and myList[j] not in [None, 0]:
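Put together with the loops from the question, a minimal runnable sketch might look like this. The function name `nested_loop_check` is invented here, and starting the inner loop at j + 1 is an added assumption so that an element is never compared with itself.

```python
def nested_loop_check(my_list):
    for j in range(len(my_list)):
        # start k after j so each pair is compared once and never with itself
        for k in range(j + 1, len(my_list)):
            if my_list[j] == my_list[k] and my_list[j] not in (None, 0):
                return True
    return False

# Same data as in the question
myList = [None] * 9
myList[0] = 1
myList[3] = 2
myList[4] = 2
myList[5] = 4
myList[7] = 3

found = nested_loop_check(myList)
```

Wrapping the loops in a function also makes the `return` legal, since a bare `return` at module level is a syntax error.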

Upvotes: 2
