Andrew Smith
Andrew Smith

Reputation: 571

Using XOR operator to determine if there are duplicates in a list of integers

I have a list of numbers:

a = [1,2,3,4,5,19,22,25,17,6,73,72,71,77,899,887,44,124, ...]
#this is an abbreviated version of the list

I need to determine if there are duplicates in the list or not using the XOR ("^") operator.

Can anyone give me any tips? I'm a newbie and have never encountered this problem or used the XOR operator before.

I've tried several approaches (which have amounted to blind stabs in the dark). The last one was this:

MyDuplicatesList = [1,5,12,156,166,2656,6,4,5,9] #changed the list to make it easer
for x in MyDuplicatesList:
    if x^x:
    print("True")

I realize I'm probably violating a protocol by asking such an open-ended question, but I'm completely stumped.

Upvotes: 1

Views: 6782

Answers (4)

Organis
Organis

Reputation: 7316

Why XOR?

# true if there are duplicates
print len(set(a)) != len(a)

Ok, THIS is pythonic. It finds all the duplicates and makes a list of them.

a = [1,2,3,4,5,19,22,25,17,6,73,72,71,77,899,887,44,124,1]
b = [a[i] for i in range(len(a)) for j in range(i+1,len(a)) if i ^ j > 0 if a[i] ^ a[j] < 1]

print b

Simplified version of Dalen's idea:

a = [1,2,3,4,5,19,22,25,17,6,73,72,71,77,899,887,44,124,1]

def simplifyDalen(source):
    dup = 0

    for x in source:
        for y in source:
            dup += x ^ y == 0

    return dup ^ len(source) > 0

result = simplifyDalen(a)

if result:
    print "There are duplicates!"
else:
    print "There are no duplicates!"

So far my bit index idea is the fastest (because it's one pass algorithm, I guess, not many to many)

Upvotes: 3

Dalen
Dalen

Reputation: 4236

When you xor two same numbers, you get 0. As you should know.

from operator import xor

def check (lst):
    dup = 0
    for x in lst:
        for y in lst:
            dup += xor(x, y)!=0
    l = len(lst)
    return dup!=(l**2 -l)

c = check([0,1,2,3,4,5,4,3,5])

if c:
    print "There are duplicates!"
else:
    print "There are no duplicates!"

BTW, this is extremely stupid way to do it. XORing is fast, but O(n**2) (always through whole set) is unnecessary loss. For one thing, the iteration should be stopped when first duplicate is encountered. For another, this really should be done using set() or dict(). But you can get the idea.

Also, using xor() function instead of bitwise operator '^' slows the thing down a bit. I did it for clarity as I complicated the rest of the code. And so that people know that it exists as an alternative.

Here is an example on how to do it better. It's a slight modification of code suggested by Organis in comments.

def check (lst):
    l = len(lst)
    # I skip comparing first with first and last with last
    # to prevent 4 unnecessary iterations. It's not much, but it makes sense.
    for x in xrange(1, l):
        for y in xrange(l-1):
            # Skip elements on same position as they will always xor to 0  :D
            if x!=y: # Can be (if you insist): if x^y != 0:...
                if (lst[x] ^ lst[y])==0:
                    return 1 # Duplicate found
    return 0 # No duplicates

c = check([0,1,2,3,4,5,4,3,5])

if c:
    print "There are duplicates!"
else:
    print "There are no duplicates!"

To repeat, XOR is not to be used for comparison, at least not in high-level programming languages. It's possible you would need similar thing in assembly for some reason, but eq works nicely there as anywhere. If we simply used == here instead, we would be able to check for duplicates in lists containing anything, not just integers.

XOR is fancy for other uses, as for ordinary bitwising (masking, unmasking, changing bits...), so in encryption systems and similar stuff.

Upvotes: 3

Organis
Organis

Reputation: 7316

Well, lets use list items as bit indices:

def dupliXor(source):
    bigs = 0

    for x in source:
        b = 1<<x
        nexts = bigs ^ b

        # if xor removes the bit instead
        # of adding it then it is duplicate
        if nexts < bigs:
            print True
            return

        bigs = nexts

    print False

a = [1,2,3,4,5,19,22,25,17,6,73,72,71,77,899,887,44,124,1]

dupliXor(a) # True

a = [1,2,3,4,5,19,22,25,17,6,73,72,71,77,899,887,44,124]

dupliXor(a) # False

Upvotes: 1

Vedranh13
Vedranh13

Reputation: 76

XOR takes the binary repersentation of a number and then compares it bit-by-bit with another and outputs a 1 if the two bits are different and a 0 otherwise. Example: 1 ^ 2 = 3 because in binary 1 is 01 and 2 is 10 so comparing bit-by-bit we get 11 or 3. XORing a number with itself always gives 0, so we can use this property to check if two numbers are the same. There are myriad better ways to check for duplicates in a list than using XOR, but if you want / need to do it this way, hopefully the above information gives you an idea of where to start.

Upvotes: 1

Related Questions