Reputation: 9829
I found a java class that convert byte or char to hexadecimal value. But I cannot understand the code clearly. Can you explain what the code do or where I can find more resources about this?
public class UnicodeFormatter {
static public String byteToHex(byte b) {
// Returns hex String representation of byte b
char hexDigit[] = {
'0', '1', '2', '3', '4', '5', '6', '7',
'8', '9', 'a', 'b', 'c', 'd', 'e', 'f'
};
char[] array = {hexDigit[(b >> 4) & 0x0f], hexDigit[b & 0x0f]};
return new String(array);
}
static public String charToHex(char c) {
// Returns hex String representation of char c
byte hi = (byte) (c >>> 8);
byte lo = (byte) (c & 0xff);
return byteToHex(hi) + byteToHex(lo);
}
} // class
Upvotes: 3
Views: 3760
Reputation: 91349
First, let us start with some definitions:
char
, in Java, occupies 2 bytes;byte
is composed by 8 bits;Therefore, a byte
can be represented by 2 hexadecimal digits, i.e., two groups of 4 bits. This is exactly what is being done in the byteToHex
method: it firsts splits the byte in two groups of 4 bits, and then maps each into an hexadecimal symbol, using the hexDigit
array. Since the decimal value of each group of 4 bits can never be greater or equal than 16 (2^4
), each group will always have a mapping in the hexDigits
array.
For example, suppose you want to convert the number 29
to hexadecimal:
29
is represented in binary as 00011101
;00011101
in two groups of 4 bits yields 0001
and 1101
;0001
can be obtained by shifting away the least significant 4 bits (1101
) from the binary representation of 29
. Then, 0001
would become the first 4
bits. This is accomplished in Java with (b >> 4
);b & 0x0f
, which is equivalent to 00011101 & 00001111 = 00001101 = 1101
. By bit-AND
ing the binary number with 0x0f
you are clearing (setting to 0) everything except the least significant 4 bits.1
(0001
) and 13
(1101
), which are then mapped to 1
and D
respectively, in the hexadecimal system.29
is therefore represented by 1D
in hexadecimal.A similar logic can be applied to the method charToHex
. The only difference is instead of converting a single byte, you are converting 2, since a char
is 2 bytes.
Upvotes: 4
Reputation: 5094
Basically what it is doing here is the same as turning 23 into a string by changing it to 2*10+3, then turning 2 and 3 into characters.
To break it down, we first divide by 16, since we're working in hex.
b >> 4 means shift the bits 4 spaces, so
12345678 >> 4 = 00001234
then the value in positions 1234 get looked up in the hexDigit array.
Then we do a modulus operation, also known as getting the remainder. In the decimal example, this is finding the 3 by chopping off everything to the left. For binary, they are using AND here.
0x0f in bits is 00001111, so when ANDed with a byte, it will change the left 4 spaces into 0s, leaving only the right 4.
12345678 & 0x0f = 00005678
and again we look up the value in positions 5678 in the hexDigit array. Note that I'm using 1-8 as position markers, the actual data will be all 0s and 1s.
Edit: The second function does basically the same operation, it uses the same >>> and & functions to split the unicode char into bytes. It appears to be assuming the unicode character is 16 bits, so it shifts it 8 places to get the left 8 bits, and uses & 0xff to get the right 8 bits.
Upvotes: 2