Reputation: 638
If Strings in Java are UTF-16, then a single UTF-16 character may take up 4 bytes, so one such character would have to map to 2 chars.
This would mean that the String length may be less than the length of the equivalent char[].
But that is not the case.
char x = (char) 7000;
String s = "" + x + x + x;
byte[] ar = s.getBytes();
char[] arr = s.toCharArray();
The byte array has length 9.
The char array has length 3.
So how can a char have a size of 2 bytes?
I therefore think a char in Java may be larger than 2 bytes, depending on the need. Is that correct?
If so, what is the maximum size of a char in Java? Or is it variable-length, so that it could grow without bound in the future?
Upvotes: 0
Views: 1751
Reputation: 20520
The String.getBytes() call doesn't return the UTF-16 internal representation. It returns the string in the platform's default encoding. In your case, that is quite likely to be UTF-8 (though, being a platform-determined thing, you'd need to check to be sure). The UTF-8 encoded form of (char)7000 (Unicode codepoint U+1B58 BALINESE DIGIT EIGHT) is 3 bytes: E1 AD 98. Hence your 9 bytes for 3 chars.
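To make this easier to check, here is a minimal sketch that passes an explicit charset to getBytes instead of relying on the platform default. The class name CharSizeDemo and its two helper methods are made up for illustration; they are not part of the question's code. It also includes a code point above U+FFFF (U+1F600, chosen arbitrarily) to show the case the question was worried about: a char is always 16 bits, and such a character simply occupies two of them.

```java
import java.nio.charset.StandardCharsets;

public class CharSizeDemo {
    // Hypothetical helper: byte count of s when encoded as UTF-8.
    public static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Hypothetical helper: byte count of s when encoded as UTF-16.
    // UTF_16BE writes no byte-order mark, so the result is 2 bytes per char.
    public static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        char x = (char) 7000; // U+1B58
        String s = "" + x + x + x;

        System.out.println(s.toCharArray().length); // 3 chars
        System.out.println(utf8Length(s));          // 9 bytes (3 per character)
        System.out.println(utf16Length(s));         // 6 bytes (2 per char)

        // A code point above U+FFFF is stored as a surrogate pair.
        String supplementary = new String(Character.toChars(0x1F600));
        System.out.println(supplementary.length());                                  // 2 chars
        System.out.println(supplementary.codePointCount(0, supplementary.length())); // 1 code point
        System.out.println(utf16Length(supplementary));                              // 4 bytes
    }
}
```

So the 9 in the question measures the UTF-8 encoding, not the size of a char. String.length() and toCharArray() both count 16-bit chars, which is why they always agree with each other; codePointCount is the call that can return a smaller number.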
Upvotes: 5