Mark
Mark

Reputation: 55

Index of the first duplicate element

I need help with finding the index of the first duplicate element in array. I tried this but it gets me all indices of all duplicate elements.

res = [idx for idx, item in enumerate(input) if item in input[:idx]]

Upvotes: 4

Views: 774

Answers (4)

CrazyChucky
CrazyChucky

Reputation: 3518

tl;dr: use wim's or Mad Physicist's answers. Anything based on the technique you're starting with will do a lot of unnecessary looping. The following isn't so much a solution as some parallel ideas about how your current code could be improved.

The simplest modification to your code would be to simply access the first item of the resulting list:

res = [idx for idx, item in enumerate(values) if item in values[:idx]][0]

But that's kind of silly, since you'll still be traversing the entire list, finding all duplicates, then throwing away all but one. It will also throw a KeyError if the list is empty (that is, if there are no duplicates). A cleaner and more efficient method would be to call next on a version of your list comprehension turned into a generator expression:

res = next((idx for idx, item in enumerate(values) if item in values[:idx]), None)

None is supplied as a default which will be returned if no duplicates are found.

This still has the same efficiency issues that wim points out, but knowing about indexing, next, and generator expressions can be useful!

Upvotes: 2

Mad Physicist
Mad Physicist

Reputation: 114518

You can maintain a set of items you've seen so far:

def first_dup(a):
    seen = set()
    for i, e in enumerate(a):
        if e in seen:
            return i
        seen.add(e)
    return -1

You can isolate the set check into a separate function and combine the rest into an efficient one-liner using next:

def first_dup(a):
    seen = set()
    return next((i for i, e in enumerate(a) if (e in seen or seen.add(e))), -1)

This cheats by using the fact that or will evaluate the second expression any time the first is False, and set.add is an in-place operation that returns None.

Upvotes: 1

wim
wim

Reputation: 363456

Unfortunately your idea is O(n^2), because slicing and checking membership is O(n) and you do this within a loop. It's essentially a nested loop.

Instead, you can do this in O(n) using a single loop:

seen = {}
for i, item in enumerate(my_input):
    if item in seen:
        break
    seen[item] = i
else:
    i = item = None

At the end of the loop, item is the first duplicate item and i is its index. seen[item] will hold the index of the first occurrence of this duplicate item.

Upvotes: 4

FatemeZamanian
FatemeZamanian

Reputation: 154

If you want the first index in the iteration:

array=[2,3,4,5,3]
for i,a in enumerate(array):
    for j in range(i):
        if a==array[j]:
            print(j)
            exit()

If you want the last index in the iteration:

array=[2,3,4,5,3]
for i,a in enumerate(array):
    for j in range(i):
        if a==array[j]:
            print(i)
            exit()

Upvotes: 0

Related Questions