Reputation: 836
I wanted to use DataOutputStream#writeBytes, but was running into errors. Here is the description of writeBytes(String) from the Java documentation:
Writes out the string to the underlying output stream as a sequence of bytes. Each character in the string is written out, in sequence, by discarding its high eight bits.
I think the problem I'm running into is due to the part about "discarding its high eight bits". What does that mean, and why does it work that way?
Upvotes: 6
Views: 595
Reputation: 15418
The char data type is a single 16-bit Unicode character. It has a minimum value of '\u0000' (or 0) and a maximum value of '\uffff' (or 65,535, inclusive). The byte data type, by contrast, is an 8-bit signed two's complement integer. It has a minimum value of -128 and a maximum value of 127 (inclusive). A byte therefore cannot hold a whole char, so this method writes only the low-order byte of each char in the string, from first to last. Any information in the high-order byte is lost. In other words, it assumes the string contains only characters whose value is between 0 and 255.
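To make the truncation concrete, here is a minimal sketch that writes a two-character string through writeBytes into an in-memory buffer and prints the resulting bytes. The class name WriteBytesDemo and the helper viaWriteBytes are made up for illustration; only DataOutputStream and ByteArrayOutputStream come from the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class WriteBytesDemo {

    // Hypothetical helper: returns the bytes that writeBytes emits for s.
    static byte[] viaWriteBytes(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeBytes(s);
        } catch (IOException e) {
            // Not expected for an in-memory buffer.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // 'A' is U+0041: the high byte is 0x00, so nothing is lost.
        // The euro sign is U+20AC: the high byte 0x20 is dropped, leaving 0xAC.
        for (byte b : viaWriteBytes("A\u20AC")) {
            System.out.printf("0x%02X%n", b & 0xFF);
        }
        // prints 0x41, then 0xAC
    }
}
```

One byte comes out per char, so a reader on the other end has no way to tell that 0xAC was originally U+20AC.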
You may want to look into the writeUTF(String s) method, which retains the information in the high-order byte as well as the length of the string. First it writes the number of bytes in the encoded string to the underlying output stream as a 2-byte unsigned value between 0 and 65,535. Next it writes the string itself, encoded in Java's modified UTF-8. This allows a data input stream reading those bytes with readUTF() to completely reconstruct the string.
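As a rough illustration of that round trip, the sketch below writes a string with writeUTF and reads it back with DataInputStream#readUTF. The class name WriteUtfDemo and the helpers encode and decode are hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class WriteUtfDemo {

    // Hypothetical helper: length prefix plus encoded bytes, as written by writeUTF.
    static byte[] encode(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Hypothetical helper: reads back one string written by writeUTF.
    static String decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "A\u20AC";
        byte[] data = encode(original);
        // Layout: 00 04 (length prefix, counted in encoded bytes),
        // then 41 for 'A' and E2 82 AC for the euro sign.
        System.out.println(data.length);                    // prints 6
        System.out.println(decode(data).equals(original));  // prints true
    }
}
```

Because the prefix is only two bytes, writeUTF throws UTFDataFormatException when the encoded form is longer than 65,535 bytes, so it suits short strings rather than arbitrary text.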
Upvotes: 5
Reputation: 77226
Most Western programmers tend to think in terms of ASCII, where one character equals one byte, but Java Strings are sequences of 16-bit Unicode chars. writeBytes just writes out the lower byte of each one, which for ASCII/ISO-8859-1 text is the "character" in the C sense.
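One way to check the ASCII/ISO-8859-1 point is to compare the output of writeBytes with an explicit ISO-8859-1 encoding of the same string. This is only a sketch; Latin1Demo, viaWriteBytes, and matchesLatin1 are names invented for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1Demo {

    // Hypothetical helper: returns the bytes that writeBytes emits for s.
    static byte[] viaWriteBytes(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeBytes(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // True when writeBytes produces the same bytes as an ISO-8859-1 encoding.
    static boolean matchesLatin1(String s) {
        return Arrays.equals(viaWriteBytes(s), s.getBytes(StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) fits in one byte, so both paths agree.
        System.out.println(matchesLatin1("caf\u00E9")); // prints true
        // U+20AC does not fit: getBytes substitutes '?', writeBytes emits 0xAC.
        System.out.println(matchesLatin1("\u20AC"));    // prints false
    }
}
```

For text outside that range, encoding explicitly with s.getBytes(StandardCharsets.UTF_8) and passing the result to write(byte[]) avoids the silent truncation.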
Upvotes: 7