Reputation: 53129
In a simple server-client application I'm reading data this way:
if ((value = in.read()) != -1) {
    if (protocol.newChar((char) value, input)) {
        // Consider the current buffer contents a complete message
        protocol.receiveMessage(input.toString());
        // Clear the buffer
        input.setLength(0);
    }
}
The Java documentation says that the read method reads a single character "as an integer in the range 0 to 65535 (0x00-0xffff), or -1 if the end of the stream has been reached".
Setting aside the business of reading characters as integers, I'm still confused by the results this function gives me. I'm sending an integer as 4 bytes (I put it in a byte array and send it).
On the receiving side, I see this in the console:
Received character: [0]
Received character: [0]
Received character: [0]
Received character: [8]
This is produced by the following code, where current is the char returned by in.read():
Log.debug("Received character: "+current+" ["+(int)current+"]");
Obviously I'm confused about what happened. Did the sending function convert the bytes back into chars?
While debugging, I discovered a funny thing: the \0 character in the NetBeans console can be copied along with other text, but when pasting, only the data before the first \0 is pasted (Windows 7).
Upvotes: 2
Views: 2590
Reputation: 43391
A Reader reads chars, not bytes. It often does so (eventually, depending on the reader, its delegate, etc.) by reading bytes and converting them to chars, either by a charset you manually specify, or by the system default. InputStreamReader is generally where this happens. From that class' javadocs:
Each invocation of one of an InputStreamReader's read() methods may cause one or more bytes to be read from the underlying byte-input stream. To enable the efficient conversion of bytes to characters, more bytes may be read ahead from the underlying stream than are necessary to satisfy the current read operation.
So the answer is really, "it's implementation defined," but it'll be at least as many bytes as are required to form one char. Without knowing your charset, we can't say what that is; it's 1 for the "usual" (ASCII) chars in UTF-8, 2 for all chars in UTF-16, etc. But InputStreamReader allows itself wiggle room to read ahead for efficiency, by some indeterminate amount.
If you're using UTF-8 (a common default) and sending the four bytes [0, 0, 0, 8], then these correspond to four chars: [\u0000, \u0000, \u0000, \u0008]. In that case, it would make sense that sending an integer as 4 bytes would cause you to receive 4 chars.
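To make this concrete, here is a minimal sketch that feeds the four bytes through an InputStreamReader with an explicit UTF-8 charset. The class and method names are only illustrative, and a ByteArrayInputStream stands in for the socket stream, which is an assumption about your setup.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ReaderDecodeDemo {
    // Decodes the bytes as UTF-8 through a Reader and collects the
    // numeric value of every char that read() returns.
    static List<Integer> readChars(byte[] bytes) {
        List<Integer> values = new ArrayList<>();
        try (Reader in = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            int value;
            while ((value = in.read()) != -1) {
                values.add(value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] sent = {0, 0, 0, 8}; // the int 8 as four big-endian bytes
        System.out.println(readChars(sent)); // prints [0, 0, 0, 8]
    }
}
```

Each byte below 128 decodes to exactly one char, so four read() calls return four values.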
Upvotes: 4
Reputation: 8928
BufferedReader.read() returns one character per call. How many bytes make up that character depends on the character encoding. On many platforms, the default character encoding is something like UTF-8, which uses a single byte for each ASCII character and two to four bytes for other characters.
Note that the platform character encoding may be different from Java's internal representation of characters, which uses two bytes per character.
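A quick way to see the difference between encodings is to encode the same text with different charsets and compare the byte counts. This is only a sketch; the class and method names are made up for illustration, and UTF_16BE is used so that no byte order mark is added to the output.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BytesPerCharDemo {
    // Returns how many bytes the given text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // The platform default varies, so no expected value is shown here.
        System.out.println("Default charset: " + Charset.defaultCharset());

        System.out.println(encodedLength("A", StandardCharsets.UTF_8));         // 1
        System.out.println(encodedLength("\u00e9", StandardCharsets.UTF_8));    // 2
        System.out.println(encodedLength("\u20ac", StandardCharsets.UTF_8));    // 3
        System.out.println(encodedLength("A", StandardCharsets.UTF_16BE));      // 2
        System.out.println(encodedLength("\u20ac", StandardCharsets.UTF_16BE)); // 2
    }
}
```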
Upvotes: 0
Reputation: 4190
The behaviour of BufferedReader.read() is the same as described in Reader.read(). The only difference is that it only reads data from the underlying stream if the buffer is empty.
The bytes are converted to characters, but how depends on the charset. If the charset is UTF-8 and a byte is higher than 127, you may receive fewer than 4 characters, because such bytes belong to multi-byte sequences.
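As a rough illustration, the sketch below decodes three four-byte inputs as UTF-8 and counts the resulting chars. The names are hypothetical. Note that a lone high byte that does not form a valid sequence is replaced by U+FFFD by Java's String decoder, so in that case the count stays at 4.

```java
import java.nio.charset.StandardCharsets;

public class Utf8HighByteDemo {
    // Decodes the bytes as UTF-8 and returns the number of chars produced.
    static int decodedCharCount(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8).length();
    }

    public static void main(String[] args) {
        byte[] ascii = {0, 0, 0, 8};
        // 0xC3 0xA9 is the two-byte UTF-8 sequence for U+00E9.
        byte[] twoByteSequence = {0, 0, (byte) 0xC3, (byte) 0xA9};
        // 0x88 alone is a continuation byte with no lead byte (malformed).
        byte[] loneHighByte = {0, 0, 0, (byte) 0x88};

        System.out.println(decodedCharCount(ascii));           // 4
        System.out.println(decodedCharCount(twoByteSequence)); // 3
        System.out.println(decodedCharCount(loneHighByte));    // 4, last char is U+FFFD
    }
}
```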
Upvotes: 0
Reputation: 4351
Did sending function convert bytes back into chars?
The documentation for read states: "Reads a single character."
Send: You're writing a 4-byte integer, apparently consisting of the bytes 0, 0, 0, 8.
Read: You only read one character (a two-byte Java char) each time.
private char cb[];
...
return cb[nextChar++];
So, either you write only two bytes, or you read 4 bytes and interpret them as a 4-byte integer.
We need more of your code to answer your original question.
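In the meantime, if the goal is to get the integer back rather than individual chars, one option is to read from the byte stream directly instead of going through a Reader. A minimal sketch, assuming the sender writes the int in big-endian order (which is what DataOutputStream.writeInt does); the class and method names are illustrative, and byte array streams stand in for the socket streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class IntRoundTripDemo {
    // Writes the int as four big-endian bytes.
    static byte[] encodeInt(int value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads four bytes and reassembles them into one int.
    static int decodeInt(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = encodeInt(8);          // {0, 0, 0, 8}
        System.out.println(wire.length);     // 4
        System.out.println(decodeInt(wire)); // 8
    }
}
```

With a real socket, the same DataInputStream would wrap socket.getInputStream(); mixing it with a Reader on the same stream is risky because the Reader may read ahead.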
Upvotes: 0