qwerty

Reputation: 3869

Convert single byte to string and back to byte

I'm trying to convert a single byte to a string and then back to the original byte, but the assertion below fails. Any suggestions would be much appreciated.

import org.junit.Test;
import java.io.UnsupportedEncodingException;
import static org.junit.Assert.assertEquals;

public class ByteTest {
    private static final String CHARSET = "UTF-8";

    @Test
    public void test() throws UnsupportedEncodingException {
        byte b = (byte) (220);
        String s = new String(new byte[] { b }, CHARSET);
        byte[] parsed = s.getBytes(CHARSET);
        assertEquals(b, parsed[0]); // fails
    }
}

Upvotes: 1

Views: 1661

Answers (1)

rustyx

Reputation: 85341

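Byte 220 (0xDC) by itself is invalid UTF-8. In UTF-8, a lead byte in the range 0xC2..0xDF starts a two-byte sequence and must be followed by a continuation byte (0x80..0xBF). Java decodes the lone byte to the replacement character U+FFFD, which encodes back to the three bytes EF BF BD, so parsed[0] is 0xEF rather than 0xDC.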
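To see what actually comes back from the UTF-8 round trip, here is a minimal sketch. The class and method names (Utf8LoneByteDemo, roundTripUtf8) are made up for illustration and are not part of the question's code. It relies on the documented behaviour of new String(byte[], Charset), which replaces malformed input with the charset's replacement string.

```java
import java.nio.charset.StandardCharsets;

class Utf8LoneByteDemo {
    // Hypothetical helper: decodes one byte as UTF-8, then encodes the
    // resulting string back to UTF-8 bytes.
    static byte[] roundTripUtf8(byte b) {
        String s = new String(new byte[] { b }, StandardCharsets.UTF_8);
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] parsed = roundTripUtf8((byte) 220);
        StringBuilder hex = new StringBuilder();
        for (byte p : parsed) {
            // Mask with 0xFF so negative bytes print as two hex digits.
            hex.append(String.format("%02X ", p & 0xFF));
        }
        System.out.println(hex.toString().trim()); // prints "EF BF BD"
    }
}
```

The single input byte turns into three output bytes, which is why comparing parsed[0] with the original byte fails.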

Try a single-byte encoding such as ISO-8859-1, which maps each byte value 0x00..0xFF to the character U+0000..U+00FF, so every byte round-trips unchanged in Java.

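import org.junit.Test;
import java.io.UnsupportedEncodingException;
import static org.junit.Assert.assertEquals;

public class ByteTest {
    // ISO-8859-1 maps every byte value to exactly one character.
    private static final String CHARSET = "ISO-8859-1";

    @Test
    public void test() throws UnsupportedEncodingException {
        byte b = (byte) (220);
        String s = new String(new byte[] { b }, CHARSET);
        byte[] parsed = s.getBytes(CHARSET);
        assertEquals(b, parsed[0]);
    }
}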
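As a quick check that this holds for every byte and not just 220, here is a rough sketch that runs all 256 values through ISO-8859-1. The class and method names (Latin1RoundTripDemo, survivesRoundTrip, countSurvivors) are hypothetical. It uses StandardCharsets instead of a charset name string, which also avoids the checked UnsupportedEncodingException.

```java
import java.nio.charset.StandardCharsets;

class Latin1RoundTripDemo {
    // Returns true if the byte survives bytes -> String -> bytes unchanged.
    static boolean survivesRoundTrip(byte b) {
        String s = new String(new byte[] { b }, StandardCharsets.ISO_8859_1);
        byte[] parsed = s.getBytes(StandardCharsets.ISO_8859_1);
        return parsed.length == 1 && parsed[0] == b;
    }

    // Counts how many of the 256 possible byte values round-trip cleanly.
    static int countSurvivors() {
        int count = 0;
        for (int v = 0; v < 256; v++) {
            if (survivesRoundTrip((byte) v)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // prints "256 of 256 byte values survive"
        System.out.println(countSurvivors() + " of 256 byte values survive");
    }
}
```

If the goal is only to carry arbitrary bytes through a String, this works, but the resulting text is not meaningful unless the data really is ISO-8859-1.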

Upvotes: 3
