Questions about an XOR article

Question

https://florian.github.io/xor-trick/

There's a part in the article that reads (the first step operates on (x,y))

x ^= y # => (x ^ y, y)

y ^= x # => (x ^ y, y ^ x ^ y) = (x ^ y, x)

x ^= y # => (x ^ y ^ x, x) = (y, x)

I would have thought second line would be

y ^= x # =>(x ^ y ^ x, y ^ x) = (y, y ^ x)

then third line

x ^= y # => (y, y ^ x ^ y) = (y, x)

If that part of the article is correct and I'm in error about how it should work, any tips about what I'm missing?

Later,the article has

If we analyze the individual bits in u ^ v, then every 0 means that the bit had the same value in both u and v. Every 1 means that the bits differed.

Using this, we find the first 1 in u ^ v, i.e. the first position i where u and v have to differ. Then we partition A as well as the numbers from 1 to n according to that bit. We end up with two partitions, each of which contains two sets:
Partition 0
    The set of all values from 1 to n where the i-th bit is 0
    The set of all values from A where the i-th bit is 0
Partition 1
    The set of all values from 1 to n where the i-th bit is 1
    The set of all values from A where the i-th bit is 1
Since u and v differ in position i, we know that they have to be in different partitions.

Let me see if I get this. Partition 0 contains one of u or v, we don't know which. Partition 1 contains v or u, whichever wasn't in Partition 0. We operate each with ^u^v to get u and v respectively. So yes, the array has to be partitioned into two parts, and I think I understand the basis for the partition and why it has to be done (to later isolate u and v after operating ^u^v).

But why is it noted that each partition contains two sets? I'm assuming the ith bit is the first "1" bit in u^v. It wouldn't be enough to partition the array into two parts, one partition with ith bit being 0 and one partition with ith bit being 1? 🤔 What's the significance of the set of all values from 1 to n where th i-th bit is 0, or 1, respectively?

Or is it not significant that the partitions each have two sets, and the sets are just a leftover byproduct of how the partitions were determined?

Thanks for any answers.

Onno Rouast · Accepted Answer

You ask two related questions, I'll answer both:

First, in this XOR-int-swap code the commented x and y do not stand for the current variable values, rather they stand for the original values of those variables

x ^= y # => (x ^ y, y)
y ^= x # => (x ^ y, y ^ x ^ y) = (x ^ y, x)
x ^= y # => (x ^ y ^ x, x) = (y, x)

The first operation x ^= y sets x to original_x ^ original_y.

The second operation y ^= x sets y to original_y ^ original_x ^ original_y, which is just original_x.

The third operation x ^= y sets x to original_x ^ original_y ^ original_x, which becomes original_y.

The second part is about the question How can we use XOR to find two missing values in a range/sequence of numbers?.

Your understanding is correct that by using XOR on both the sequence (without missing numbers) and the array (with two missing numbers), our final result is just the XOR of the two missing numbers, since doing XOR with the same number twice cancels itself out.

Since the two missing numbers have to be different numbers, they will differ at some index i, at which their XOR will be 1.

Once we have this index we can just run the algorithm that checks for a single missing number, simply by partitioning the numbers on whether they have the i-th bit set or not. And because the two numbers differ on the i-th bit, each partition will have exactly one of the missing numbers.

why is it noted that each partition contains two sets

The word 'set' might be the confusing part here, what is meant is "a set of numbers that was XORed with", the general idea is that:

we create a variable a = 0
we XOR a with all numbers in the sequence that have the i-th bit set (this is the first 'set')
we XOR a with all numbers in our array that have the i-th bit set (this is the second 'set')

Again the XOR operations cancel each other out, just for the one missing number we executed the XOR only once, so a is 0 ^ missing_number, which is missing_number.

Find a detailed explanation of this method in this answer here.

Questions about an XOR article

Answers (1)

Related Questions