Corey
Corey

Reputation:

Explain this use of XOR in this function?

I saw this code snippet that was used to solve the ol' "Find the one number in an array that does not have a duplicate." question. I have been looking at this for a while this morning, but can not nail down exactly how it is doing it.

I don't understand how k always ends up holding the value of the non-duplicate. Can anyone explain how this works?

  static void Main(string[] args)
  {
     int[] list = { 3,6,9,12,3,6,9 };
     int k = 0;


     for (int i = 0; i < list.Length; i++)
     {
        k = k ^ list[i];
     }

     Console.WriteLine(k);
  }

Upvotes: 6

Views: 598

Answers (8)

KeithS
KeithS

Reputation: 71565

It's slightly similar to the "Nerds, Jocks and Lockers" problem, in terms of the "bit-flipping" going on leaving certain bits set and others not.

The basic behavior is that A XOR B works like "(A OR B) AND NOT (A AND B)". So, 0^0=0, 1^0 = 1, but 1^1 = 0 (unlike OR). Now, you start with zero (no bits set) on K. You then XOR it with the literal 3, which (as a byte) has the bits 00000011, and assign the result to K. You end up with 00000011 for K, because the bits that are set on the literal 3 are all not set on K when it's 0. Now, if you were to XOR K with the literal 3 again, you'd end up with 0, because all the bits match between the two values, so the XOR would return 0 on each bit.

This process is commutative, so ((((0 XOR 3) XOR 6) XOR 3) XOR 6) would give the same result (0) as ((((0 XOR 6) XOR 6) XOR 3) XOR 3), or pretty much any other combination of XORing 0 with 3 twice and 6 twice.

The net result is that, given a list of these numbers, any number that occurs twice (or an even number of times) is "XORed in" to K the first time, and then "XORed out" the second, leaving K with its bits set to the one value that only occurred once; 12.

Here's the binary demonstration of the full problem (using "nibbles" because we don't have any values over 16):

0000 0
^^^^ XOR
0011 3
---- =
0011 3
^^^^ XOR
0110 6
---- =
0101 5
^^^^ XOR
1001 9
---- =
1100 12
^^^^ XOR
1100 12
---- =
0000 0 <-this is coincidence; it'd work the same regardless of the unduped value
^^^^ XOR
0011 3
---- =
0011 3
^^^^ XOR
0110 6
---- =
0101 5
^^^^ XOR
1001 9
---- =
1100 12 <- QED

EDIT FROM COMMENTS: While this answer functions for the specific question asked, even the smallest change to the problem would "break" this implementation, such as:

  • The algorithm is completely unresponsive to the number zero; the algorithm is thus unable to tell the difference between having zero as a single unpaired value and having no unpaired values at all.
  • The algorithm only works for pairs, not triplets. If 3 occurred three times and was still a "dupe", and 12 was still the right answer, this algorithm would actually return 15 (1100 ^ 0011 == 1111) which isn't even in the list.
  • The algorithm only works if there is only one non-duplicated value in the list; if 8 and 12 were both unpaired values expected to be returned, the algorithm would return the XOR of the two (1100 ^ 1000 == 0100 == 4)

An efficient algorithm could be developed that would return the correct answer in all these cases in addition to the original case, but it would likely not involve XOR.

Upvotes: 7

Kasey Speakman
Kasey Speakman

Reputation: 4631

The code you reference does NOT actually solve the problem. For instance, put three 12s in the list and you will get 12, even though 12 is duplicated twice.

The result of XOR is essentially whether there is an odd or even number of that bit combination. So if the list contains an odd number of 12s (whether that be one 12 or a hundred-and-one 12s) and an even number of each other number, the result will always be 12.

Even worse, if the list contained an odd number of multiple different numbers, the result value may not be any of the numbers in the list. E.g., one 3 and one 14 would result in 13. This is because the XOR really operates on bits, not integers. The XOR of 14 (1110b) and 3 (0011b) results in 13 (1101b), since XOR sets any bits that are common between the numbers to zero.

Code that actually solves the problem:

using System.Linq;
using System.Collections.Generic;
...
static void Main ( string[] args)
{
    int[] list = { 3,6,9,12,3,6,9 };
    int[] nonDupes = list
        .GroupBy(x => x)
        .Where(x => x.Count() == 1)
        .Select(x => x.Key)
        .ToArray();
    string output = string.Join(",", nonDupes);
    Console.WriteLine(output);
}

Upvotes: 0

Farid Farhat
Farid Farhat

Reputation: 2310

When using XOR, you have to put in mind that you will be working with bits. In other words, you will only have 0 and 1 to deal with.

The definition of XOR is so simple:

0 XOR 0 -> 0
0 XOR 1 -> 1
1 XOR 1 -> 0

So if you apply this to your code and transform the 3 to 0011, the 6 to 0110, the 9 to 1001 and the 12 to 1100, and invoke the XOR method on them like in the example below, you will end up with 1100 that will represent the value 12.

Example:

    0011
XOR
    0110
=
    0101 -> 5

This is not an algorithm to eliminate duplicate values. This happened by coincidence that you obtained the 12.

Upvotes: 0

Elian Ebbing
Elian Ebbing

Reputation: 19027

This works because the XOR operator is both commutative and associative. This means you can rearrange the terms in any order:

a ^ b ^ c == a ^ c ^ b == c ^ a ^ b == ... (etc.)

This works for sequences of any length. Therefore:

3 ^ 6 ^ 9 ^ 12 ^ 3 ^ 6 ^ 9 == (3 ^ 3) ^ (6 ^ 6) ^ (9 ^ 9) ^ 12

Since x ^ x == 0 for any integer, which means you can eliminate all the pairs until you get 0 ^ 12, which is just equal to 12.

Upvotes: 1

acjay
acjay

Reputation: 36511

keithS is spot on, but this might be a bit easier to follow.

The simplest explanation I can think of has to do with four properties of XOR:

  1. 0 XOR x = x, for any number x
  2. x XOR x = 0, for any number x
  3. XOR is commutative: x XOR y = y XOR x (this allows you to rearrange a series of XOR operations however you see fit)
  4. XOR is associative: (x XOR y) XOR z = x XOR (y XOR z) (this allows you to evaluate XOR's in any order you see fit)

By properties 2 and 3, you can rearrange your input list so that all duplicates are next to each other:

{ 3,6,9,12,3,6,9 } -> { 3, 3, 6, 6, 9, 9, 12 };

By properties 1 and 4, we can XOR these numbers pairwise in any order, and all pairs of identical numbers become 0. After this, only the unduplicated number remains non-zero:

{ 0, 0, 0, 12 };

Also by property 1, because k is 0 to begin with, and all duplicate numbers have XOR'ed to become zero, all that's left is 12

Upvotes: 1

dagnelies
dagnelies

Reputation: 5321

Any number can be expressed as a bit sequence:

3 == 0...00011
6 == 0...00110
9 == 0...01001

If you XOR them twice, the bit switching will cancel each other.

Therefore, if a single number appears once (or an odd number of times) in the list, then it will be the only one with the bits left "uncancelled".

Upvotes: 1

Puppy
Puppy

Reputation: 146910

a ^ a = 0
a ^ 0 = a
0 ^ a = a ^ 0
a = b ^ c
a = c ^ b

int[] list = { 3,6,9,12,3,6,9 };

k ^ 3
k ^ 6
k ^ 9
k ^ 12
k ^ 3
k ^ 6
k ^ 9

=

k ^ 3 // dupe
k ^ 3 
k ^ 6
k ^ 6
k ^ 9
k ^ 9
k ^ 12 // non-dupe

=

k ^ 12

=

0 ^ 12

=

12

Upvotes: 0

Paul Tomblin
Paul Tomblin

Reputation: 182772

It only works if only one number is non-duplicated (or occurs any odd number of times) and all the other numbers occur an even number of times.

When you xor a number to another number twice (or any other even number of times) it cancels itself out leaving the original number.

Upvotes: 12

Related Questions