Learner
Learner

Reputation: 2339

Bitshift - Need help to understand the code

I am just trying to learn bitwise / shift operations.

I came across the below program but don't understand the AND condition part (checker & (1 << val) in the below program. When will the final Value be greater than 0? Can someone please explain whats happening there?

Sample input: xyzz

Sample output:

8388608Value 0checker 0final value

16777216Value 8388608checker 0final value

33554432Value 25165824checker 0final value

33554432Value 58720256checker 33554432final value

public static boolean isUniqueChars(String str) {
        int checker = 0;
        for (int i = 0; i < str.length(); i++) {
            int val = str.charAt(i) - 'a';

            System.out.println((1 << val) + "Value");
            System.out.println((checker) + "checker");
            System.out.println(((checker & (1 << val))) + "final value\n");

            if ((checker & (1 << val)) > 0) {
                return false;
            } else {
                checker = checker | (1 << val);
            }
        }
        return true;
    }

}

Upvotes: 1

Views: 165

Answers (2)

Marcellus
Marcellus

Reputation: 1287

I think what your function does is check if all characters in the input string are different. It returns false iff the same (lower case) character appears more than once.

The checker variable serves as a kind of bit map that accumulates which characters have appeared so far. Data type int consists of 32 bits which is enough to assign one bit per each character (26).

The function loops over all characters of str. The row int val = str.charAt(i) - 'a'; assigns some sort of ordinal value to val depending on the character ('a' => 0, 'b' => 1, 'c' => 2, etc.).

The expression 1 << val assigns each val in the range of (0..25) to its bit position. Therefore, character 'a' is mapped to 1 << 0 == 1 == 00000001, character 'd' is mapped 1 << 3 == 00001000, and so on. Each character is assigned its unique bit mask with exactly one bit set and all other bits cleared.

The expression (checker & (1 << val)) is > 0 exactly iff the bit that is set in 1 << val is also set in checker (note that checker might have more than one bit set). If so, the currently iterated character has appeared earlier, and the function returns false. Otherwise, the bit mask of the current character is added via bitwise OR operator | to the checker that acts as an accumulator. If all characters have been looped over and no character has been met twice, the function returns true. Note that the function might ignore upper-case and other characters.

Upvotes: 1

Powerlord
Powerlord

Reputation: 88826

OK, just to make sure you know what's going on:

int val = str.charAt(i) - 'a';

Assuming the English alphabet, this is taking the char value for your (lowercase) letter and subtracting 97 (the char value for 'a') to produce a number between 0 and 25 inclusive. Don't try this function on uppercase characters, you'll get errors unless you add a .toLowerCase() after the .charAt(i)

1 << val is bit-shifting 1 val places to the left. For instance, for 'x' (120 - 97 = 23, so... 1 << 23), the binary representation would be 00000000010000000000000000000000

OK, with me so far?

At the start, checker has all 0 bits, so it's 00000000000000000000000000000000

So... lets put in our numbers instead of our variables. For our x check, checker & (1 << val) becomes 00000000000000000000000000000000 & 00000000010000000000000000000000 which equals 00000000000000000000000000000000 because bit 23 isn't set in checker.

So, once x is processed, we add bit 23 to checker and move on to the next letter: y This time, checker & (1 << val) becomes 00000000010000000000000000000000 & 00000000100000000000000000000000 which equals 00000000000000000000000000000000 because bit 24 isn't set in checker.

For the first z, checker & (1 << val) becomes 00000000110000000000000000000000 & 00000001000000000000000000000000 which equals 00000000000000000000000000000000 because bit 25 isn't set in checker.

For the second z, checker & (1 << val) becomes 00000001110000000000000000000000 & 00000001000000000000000000000000 which equals 00000001000000000000000000000000 (decimal 33554432 or 2^25) because bit 25 is set in checker, therefore the > 0 is now true and the function returns false.

Upvotes: 6

Related Questions