Sum of bit differences among all pairs

Question

The problem statement is the following:

Given an integer array of n integers, find sum of bit differences in all pairs that can be formed from array elements. Bit difference of a pair (x, y) is count of different bits at same positions in binary representations of x and y. For example, bit difference for 2 and 7 is 2. Binary representation of 2 is 010 and 7 is 111 ( first and last bits differ in two numbers).

Examples:

Input: arr[] = {1, 2}
Output: 4
All pairs in array are (1, 1), (1, 2)
                       (2, 1), (2, 2)
Sum of bit differences = 0 + 2 +
                         2 + 0
                      = 4

Based on this post the most efficient (running time of O(n)) solution is the following:

The idea is to count differences at individual bit positions. We traverse from 0 to 31 and count numbers with i’th bit set. Let this count be ‘count’. There would be “n-count” numbers with i’th bit not set. So count of differences at i’th bit would be “count * (n-count) * 2”.

// C++ program to compute sum of pairwise bit differences
#include <iostream>
using namespace std;

int sumBitDifferences(int arr[], int n)
{
    int ans = 0;  // Initialize result

    // traverse over all bits
    for (int i = 0; i < 32; i++)
    {
        // count number of elements with i'th bit set
        int count = 0;
        for (int j = 0; j < n; j++)
            if (arr[j] & (1u << i))  // unsigned shift: 1 << 31 would overflow a signed int
                count++;

        // Add "count * (n - count) * 2" to the answer
        ans += (count * (n - count) * 2);
    }

    return ans;
}

// Driver program
int main()
{
    int arr[] = {1, 3, 5};
    int n = sizeof arr / sizeof arr[0];
    cout << sumBitDifferences(arr, n) << endl;
    return 0;
}
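
As a sanity check on the formula, here is a minimal sketch that compares the per-bit counting idea against a brute-force pass over every ordered pair. The names popcount32, sumBitDifferencesBruteForce, sumBitDifferencesByBit and checkAgreement are my own and are not from the linked post, and I've assumed 32-bit int values and switched to long long for the totals.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the post): number of set bits in x.
static int popcount32(unsigned int x) {
    int bits = 0;
    while (x != 0) {
        x &= x - 1;  // clear the lowest set bit
        ++bits;
    }
    return bits;
}

// Brute force: visit every ordered pair (x, y), so n * n XORs in total.
long long sumBitDifferencesBruteForce(const std::vector<int>& arr) {
    long long total = 0;
    for (int x : arr)
        for (int y : arr)
            total += popcount32(static_cast<unsigned int>(x ^ y));
    return total;
}

// Per-bit counting, the same idea as the posted solution: 32 passes over arr.
long long sumBitDifferencesByBit(const std::vector<int>& arr) {
    const long long n = static_cast<long long>(arr.size());
    long long total = 0;
    for (int bit = 0; bit < 32; ++bit) {
        long long count = 0;  // elements with this bit set
        for (int value : arr)
            if (static_cast<unsigned int>(value) & (1u << bit))
                ++count;
        total += count * (n - count) * 2;  // each differing pair counted in both orders
    }
    return total;
}

// Hypothetical self-check: both approaches must agree on the same input.
void checkAgreement(const std::vector<int>& arr) {
    assert(sumBitDifferencesBruteForce(arr) == sumBitDifferencesByBit(arr));
}
```

For {1, 2} both functions return 4, matching the example above, and for the driver's {1, 3, 5} both return 8.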

What I'm not entirely clear on is how the running time can be linear when there are two nested for loops, each incrementing by 1 per iteration. My interpretation is that the outer loop always runs exactly 32 times (i goes from 0 to 31, one pass per bit position of a 32-bit integer), and I'm guessing all 32 bit shifts happen in the same clock period (or at least quickly compared to the linear iteration), so the overall running time is dominated by the linear iteration over the array.

Is this the correct interpretation?

Trevor Merrifield · Accepted Answer

In English, "My algorithm runs in O(n) time" translates to "My algorithm runs in time that is at most proportional to n for very large inputs". The proportionality aspect of that is the reason that 32 iterations in an outer loop don't make any difference. The execution time is still proportional to n.

Let's look at a different example:

for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        // constant-time work
    }
}

In this example the execution time is proportional to n² so it's not O(n). It is however O(n²). And technically O(n³) and O(n⁴), ... as well. This follows from the definition.
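
To make the proportionality point concrete, here is a rough sketch that just counts inner-loop steps for the two loop shapes. The function names stepsFixedOuterLoop, stepsNestedLoop and checkGrowth are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Outer loop runs a fixed 32 times, like the bit-counting solution.
std::uint64_t stepsFixedOuterLoop(std::uint64_t n) {
    std::uint64_t steps = 0;
    for (int bit = 0; bit < 32; ++bit)
        for (std::uint64_t j = 0; j < n; ++j)
            ++steps;
    return steps;  // always 32 * n
}

// Both loops depend on n.
std::uint64_t stepsNestedLoop(std::uint64_t n) {
    std::uint64_t steps = 0;
    for (std::uint64_t i = 0; i < n; ++i)
        for (std::uint64_t j = 0; j < n; ++j)
            ++steps;
    return steps;  // always n * n
}

// Hypothetical check: doubling n doubles the first count but quadruples the second.
void checkGrowth(std::uint64_t n) {
    assert(stepsFixedOuterLoop(2 * n) == 2 * stepsFixedOuterLoop(n));
    assert(stepsNestedLoop(2 * n) == 4 * stepsNestedLoop(n));
}
```

With n = 1000 the first function counts 32,000 steps and the second 1,000,000. The factor of 32 never changes as n grows, which is why it disappears inside the O(n) bound, while the second count grows with n itself.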

There's only so much you can talk about this stuff in English without misinterpretation, so if you want to nail down the concepts you're best off checking out the formal definition in an introductory algorithms textbook or online class and working out a few examples.
