Reputation: 87

Count number of subsets having cumulative XOR less than k

Given a set S of numbers of size N. Count all subsets of S which has cumulative XOR of elements of subset is less that K.

I could think of brute force approach to generate all subsets of S and count subsets which have cumulative XOR elements less than k. I am looking for optimized solution without generating all subsets of S, I can find all such subsets

Example: 
S = {1,2}
K = 4
U = {{},{1},{2},{1,2}}
Answer is 4 As 
cumulative XOR values are 0 for {}, 1 for {1}, 2 for {2}, 3 for {1,2}.

Upvotes: 0

Answers (2)

Master Chief

Reputation: 2541

The problem is very similar to count of subsets having sum equal to k. We can proceed in similar manner and total the count of subsets having sum equal to 0 to k.

$totalCount = \sum_{i=1}^{k-1} count\ of\ subset\ having\ sum\ i$

Below is my python implementation for the same.

It uses dynamic programming to store some intermediate results in each cell of DP table. Cell dp[i][j] contains the count of subsets equal to j which can be formed using first ith numbers in the sorted array.

Time Complexity O(n * maxXor), where maxXor is the maximum value which can be achieved by xoring any of the numbers in the array. At max maxXor will be equal to smallest power of 2 larger than maxValue present in array and K

from math import floor, log


arr = [1, 2]
K = 4


def getCoundDp(arr, k):
    arr.sort()
    maxVal = arr[-1]
    maxXor = 2**(floor(log(max(maxVal, k), 2)) + 1)
    dp = [[0 for i in range(maxXor)] for a in arr]
    dp[0][0] = 1
    # in the 1st row, mark the arr[0] to have count 1
    dp[0][arr[0]] = 1
    for row in range(1, len(arr)):
        for col in range(maxXor):
            dp[row][col] += dp[row-1][col]
            neededXor = col ^ arr[row]
            dp[row][col] += dp[row-1][neededXor]
    return sum(dp[-1][:k])


print(getCoundDp(arr, K))

Your suggestion of generating and checking all subsets would be very slow O(2^n). But should still be valuable atleast to verify the faster implementations. Below is a sample of brute force method in python using itertools.combination You can read more about it here.

from itertools import combinations


def getXor(arr):
    xor = 0
    for i in arr:
        xor ^= i
    return xor


def getCountBruteForce(arr, k):
    arr.sort()
    countLessThanK = 0
    for r in range(0, len(arr)+1):
        for comb in combinations(arr, r):
            xor = getXor(comb)
            if xor < k:
                countLessThanK += 1
    return(countLessThanK)

Upvotes: 1

eldar

Reputation: 413

Depending on the constraints of the problem, there are several viable ways to approach it:

As you pointed out, if N is small enough, trying all possible subsets in O(2^N) will yield the intended result.
If the values in S are bounded by some small enough value, you can use a dynamic programming solution outlined in Master Chief's post.
If both N is high, and values in S are large (billions and above), there is a more complicated, but polynomial-time approach you can apply. The outline is as follows:

Let's look at binary representations of the numbers in S, for example for numbers 17, 5, 20, 14 it would be:

And same for K, for example let's take K=11: 01011 11

If we wanted to count how many subsets XOR to exactly K, we could represent the problem as a system of linear equations modulo 2, with as many variables as there are numbers in S, and as many equations as there are meaningful bits in our numbers. More specifically, let the i-th equation represent the constraint "the XOR of the i-th bits of numbers in the subset of S should be equal to the i-th bit in K". (Note that XOR operation is equivalent to sum modulo 2). For example, for the smallest (rightmost) bit, we have the following: x1 * 1 + x2 * 1 + x3 * 0 + x4 * 0 = 1 (mod 2), where x_j is either 0 or 1, depending on whether we include the j-th number in our subset or not.

Note that this system of equations might have 0, 1, or many solutions. In the case of many solutions, each of the independent variables can take on either 0 or 1, and thus the number of solutions is 2^(independent variables).

We can detect the number of independent variables and solvability of a system of linear equations using Gaussian elimination, which runs in O(n^3) for a square matrix of size n - in your case the matrix isn't square, so we can use the larger of (|S|, log(max(S)) to estimate complexity.

Great, now we could go over all K' from 0 to K-1, solve the problem separately, and sum the results. However, that's nowhere better than the dynamic programming solution and is only pseudo-polynomial in runtime. Let's make another observation that will yield a polynomial solution: we are only interested in O(logK) different systems of equations to compute how many subsets XOR to less than K.

Let's denote the highest non-zero bit position in K as B. If all of bits above B, and bit B, are equal to 0 in the XOR of the subset we take, then obviously it's going to be less than K. Thus, our first system of equations can be written for only the abovementioned bits, and ignore everything below B.

Now let's look at what happens if we allow B-th bit to be equal to 1. If in number K there are one or more zero-bits following B-th position, they all have to be 0 in the resulting XOR as well. If the first subsequent non-zero bit B2 is set to 0 in our XOR, then it's going to be smaller than K. We can encode this observation as a second system of equations by saying "all bits above B are 0, bit B is 1, all the bits between B and B2 are 0, bit B2 is 0" and compute the number of solutions to it.

If we continue like this until the smallest binary position in K, we will need to build at most logK systems of equations, and obtain the result we wanted.

The complexity of this approach is something like O(logK * max(n, logK)^3), although depending on implementation Gaussian elimination can work much faster for non-square matrices.

Upvotes: 4

Count number of subsets having cumulative XOR less than k

Answers (2)

Related Questions