Reputation: 734
I need to implement a method like this: int toCodePoint(byte[] buf, int startIndex); It should decode the UTF-8 character that starts at startIndex in the byte array and return its code point. No extra objects should be created (that's why I don't use the JDK String class for decoding). Are there any existing Java classes that do this? Thank you.
Upvotes: 2
Views: 3059
Reputation: 41498
You can use java.nio.charset.CharsetDecoder to do that. You'll need a ByteBuffer and a CharBuffer. Put the data into the ByteBuffer, then use CharsetDecoder.decode(ByteBuffer in, CharBuffer out, boolean endOfInput) to read into the CharBuffer. Then you can get the code point using Character.codePointAt(char[] a, int index). It is important to use this method because if your text has characters outside the BMP, they will be translated into two chars, so it's not sufficient to read only one char.
With this method you only need to create the decoder and the two buffers once; after that, no new objects will be created unless some error occurs.
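The approach described above can be sketched roughly as follows. This is only an illustration under some assumptions: the class name ReusableUtf8Decoder, the 4-byte input buffer, and the convention of returning -1 for malformed input are all hypothetical choices, not part of the JDK. Whether the decoder allocates internally on the success path depends on the JDK implementation, so treat the "no allocation" property as something to verify with a profiler.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: the decoder and both buffers are created once and reused.
// Not thread-safe, because the buffers are shared between calls.
final class ReusableUtf8Decoder {
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
    // A UTF-8 sequence is at most 4 bytes and decodes to at most 2 chars.
    private final ByteBuffer in = ByteBuffer.allocate(4);
    private final char[] chars = new char[2];
    private final CharBuffer out = CharBuffer.wrap(chars);

    /** Returns the code point starting at startIndex, or -1 (assumed convention) if it is malformed. */
    int toCodePoint(byte[] buf, int startIndex) {
        if (startIndex < 0 || startIndex >= buf.length) {
            throw new IndexOutOfBoundsException("startIndex out of range");
        }
        in.clear();
        in.put(buf, startIndex, Math.min(4, buf.length - startIndex));
        in.flip();
        out.clear();
        decoder.reset();
        // The three-argument decode reports problems through its return value
        // instead of throwing. The result is ignored here: a following sequence
        // may be cut off by the 4-byte window even when the first one decoded fine.
        decoder.decode(in, out, true);
        if (out.position() == 0) {
            return -1; // nothing could be decoded at startIndex
        }
        // codePointAt combines a surrogate pair into one code point when present.
        return Character.codePointAt(chars, 0, out.position());
    }
}
```

A caller would create one ReusableUtf8Decoder and call toCodePoint(buf, i) repeatedly; to advance, it still needs to work out how many bytes the decoded sequence occupied, which this sketch does not return.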
Upvotes: 4
Reputation: 2208
None of the existing Java classes I know of fit this task, because of your restriction ("No extra objects should be created"). Otherwise you could use CharsetDecoder (as mentioned by Malcolm). Or you could even go over to the dark side and use sun.io.ByteToCharUTF8 if you really need a pure static method, but that is not the recommended way.
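If no existing class fits, one alternative is to decode the bytes by hand, which needs only a static method and no objects at all. The following is a minimal sketch, not a JDK API: the class name Utf8CodePoints and the -1 return value for invalid input are assumptions made for illustration. It rejects truncated sequences, overlong encodings, surrogates, and values above U+10FFFF.

```java
// Hypothetical allocation-free UTF-8 decoder for a single code point.
final class Utf8CodePoints {
    private Utf8CodePoints() {}

    /** Returns the code point starting at startIndex, or -1 (assumed convention) if invalid. */
    static int toCodePoint(byte[] buf, int startIndex) {
        int b0 = buf[startIndex] & 0xFF;
        if (b0 < 0x80) {
            return b0; // single-byte (ASCII) sequence
        }
        int length; // total bytes in the sequence
        int cp;     // payload bits taken from the lead byte
        int min;    // smallest code point allowed for this length (rejects overlong forms)
        if (b0 >= 0xC2 && b0 <= 0xDF) {
            length = 2; cp = b0 & 0x1F; min = 0x80;
        } else if ((b0 & 0xF0) == 0xE0) {
            length = 3; cp = b0 & 0x0F; min = 0x800;
        } else if (b0 >= 0xF0 && b0 <= 0xF4) {
            length = 4; cp = b0 & 0x07; min = 0x10000;
        } else {
            return -1; // continuation byte or invalid lead byte
        }
        if (startIndex + length > buf.length) {
            return -1; // sequence is truncated
        }
        for (int i = 1; i < length; i++) {
            int b = buf[startIndex + i] & 0xFF;
            if ((b & 0xC0) != 0x80) {
                return -1; // not a continuation byte
            }
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return -1; // overlong, out of range, or a surrogate
        }
        return cp;
    }
}
```

The call shape matches the signature from the question, for example Utf8CodePoints.toCodePoint(buf, 0). The trade-off compared with CharsetDecoder is that the validation rules are maintained by hand rather than by the JDK.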
Upvotes: 0