Reputation: 3869
I'm attempting to convert a single byte to a string and then back to the original byte. However, the assertion below fails. Any suggestions would be greatly appreciated.
import org.junit.Test;

import java.io.UnsupportedEncodingException;

import static org.junit.Assert.assertEquals;

public class ByteTest {
    private static final String CHARSET = "UTF-8";

    @Test
    public void test() throws UnsupportedEncodingException {
        byte b = (byte) (220);
        String s = new String(new byte[] { b }, CHARSET);
        byte[] parsed = s.getBytes(CHARSET);
        assertEquals(b, parsed[0]); // fails
    }
}
Upvotes: 1
Views: 1661
Reputation: 85341
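Byte 220 (0xDC) by itself is invalid UTF-8. In UTF-8, a byte in the range 0xC2..0xDF is the lead byte of a two-byte sequence and must be followed by a continuation byte (0x80..0xBF). Because the second byte is missing, new String(...) replaces the malformed input with U+FFFD (the replacement character), and encoding that back to UTF-8 yields the three bytes EF BF BD, so parsed[0] is 0xEF rather than 0xDC.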
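To see what the failing test actually gets back, here is a minimal sketch that prints the bytes produced by the UTF-8 round trip. The class name Utf8ReplacementDemo and the helper roundTripUtf8 are made up for illustration; StandardCharsets.UTF_8 is used instead of the "UTF-8" string only to avoid the checked exception.

```java
import java.nio.charset.StandardCharsets;

public class Utf8ReplacementDemo {
    // Hypothetical helper: decodes one byte as UTF-8, then re-encodes the string.
    static byte[] roundTripUtf8(byte b) {
        String s = new String(new byte[] { b }, StandardCharsets.UTF_8);
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] parsed = roundTripUtf8((byte) 0xDC);
        StringBuilder hex = new StringBuilder();
        for (byte p : parsed) {
            // Mask with 0xFF so negative byte values print as unsigned hex.
            hex.append(String.format("%02X ", p & 0xFF));
        }
        System.out.println(hex.toString().trim()); // prints "EF BF BD"
    }
}
```

The single input byte comes back as three bytes, which is why comparing b with parsed[0] fails.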
Try another encoding, e.g. ISO-8859-1. It maps each byte 0x00..0xFF to the code point with the same value (U+0000..U+00FF), so every single byte survives the byte-to-character round trip in Java.
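import org.junit.Test;

import java.io.UnsupportedEncodingException;

import static org.junit.Assert.assertEquals;

public class ByteTest {
    // ISO-8859-1 maps every byte value to exactly one character.
    private static final String CHARSET = "ISO-8859-1";

    @Test
    public void test() throws UnsupportedEncodingException {
        byte b = (byte) (220);
        String s = new String(new byte[] { b }, CHARSET);
        byte[] parsed = s.getBytes(CHARSET);
        assertEquals(b, parsed[0]); // passes
    }
}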
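As a rough check that this holds for more than just 220, the sketch below runs the same round trip over all 256 byte values. The class name Latin1RoundTrip and the method roundTripsAllBytes are illustrative, not part of any library; it uses StandardCharsets.ISO_8859_1 (available since Java 7), which also removes the need for throws UnsupportedEncodingException.

```java
import java.nio.charset.StandardCharsets;

public class Latin1RoundTrip {
    // Hypothetical helper: returns true if every byte value 0..255
    // survives a decode/encode round trip through ISO-8859-1.
    static boolean roundTripsAllBytes() {
        for (int i = 0; i < 256; i++) {
            byte b = (byte) i;
            String s = new String(new byte[] { b }, StandardCharsets.ISO_8859_1);
            byte[] parsed = s.getBytes(StandardCharsets.ISO_8859_1);
            if (parsed.length != 1 || parsed[0] != b) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(roundTripsAllBytes()); // prints "true"
    }
}
```

Note that this only guarantees the bytes come back unchanged; the resulting string is meaningful text only if the data really was ISO-8859-1 to begin with.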
Upvotes: 3