Reputation: 6934
I understand that Java character streams wrap byte streams such that the underlying byte stream is interpreted as per the system default or an otherwise specifically defined character set.
My system's default char-set is UTF-8.
If I use a FileReader to read in a text file, everything looks normal as the default char-set is used to interpret the bytes from the underlying InputStreamReader. If I explicitly define an InputStreamReader to read the UTF-8 encoded text file in as UTF-16, everything obviously looks strange. Using a byte stream like FileInputStream and redirecting its output to System.out, everything looks fine.
So, my questions are:
Why is it useful to use a character stream?
Why would I use a character stream instead of directly using a byte stream?
When is it useful to define a specific char-set?
Upvotes: 0
Views: 1303
Reputation: 1501636
Code that deals with strings should only "think" in terms of text - for example, when reading an input source line by line, you don't want to care about the nature of that source.
However, storage is usually byte-oriented - so you need to create a conversion between the byte-oriented view of a source (encapsulated by InputStream) and the character-oriented view of a source (encapsulated by Reader).
So a method which (say) counts the lines of text in an input source should take a Reader parameter. If you want to count the lines of text in two files, one of which is encoded in UTF-8 and one of which is encoded in UTF-16, you'd create an InputStreamReader around a FileInputStream for each file, specifying the appropriate encoding each time.
(Personally I would avoid FileReader completely - the fact that it doesn't let you specify an encoding makes it useless IMO.)
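As a rough sketch of the line-counting idea above: the class name `LineCounter` and the method `countLines` are made up for illustration, and in-memory byte arrays stand in for the two files so the example runs on its own. With real files you would wrap a FileInputStream in the same way. (Since Java 11, FileReader also has constructors that accept a Charset, so the objection above applies to older versions.)

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class LineCounter {
    // Works purely on characters; it neither knows nor cares how they were stored.
    static int countLines(Reader reader) throws IOException {
        BufferedReader buffered = new BufferedReader(reader);
        int count = 0;
        while (buffered.readLine() != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String text = "first\nsecond\nthird\n";
        // Hypothetical stand-ins for two files stored with different encodings.
        InputStream utf8Source =
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        InputStream utf16Source =
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_16));

        // The encoding is named once, at the byte/character boundary.
        try (Reader utf8Reader = new InputStreamReader(utf8Source, StandardCharsets.UTF_8);
             Reader utf16Reader = new InputStreamReader(utf16Source, StandardCharsets.UTF_16)) {
            System.out.println(countLines(utf8Reader));
            System.out.println(countLines(utf16Reader));
        }
    }
}
```

Both calls report the same count even though the underlying bytes differ, because the encoding is dealt with when the Reader is built, not inside `countLines`.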
Upvotes: 6
Reputation: 533620
When you are reading/writing text which could contain characters > 127, use a char stream. When you are reading/writing binary data, use a byte stream.
You can read text as binary if you wish, but unless you make a lot of assumptions it rarely gains you much.
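To illustrate why characters above 127 matter, here is a small sketch (the class `BytesVersusChars` and its helper methods are hypothetical names) that reads the same UTF-8 data once as bytes and once as characters. The é in "café" takes two bytes in UTF-8, so the two views disagree on the length.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class BytesVersusChars {
    // Counts raw bytes delivered by a byte stream.
    static int countBytes(InputStream in) throws IOException {
        int n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    // Counts decoded characters (UTF-16 code units) delivered by a character stream.
    static int countChars(Reader reader) throws IOException {
        int n = 0;
        while (reader.read() != -1) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        // "caf\u00e9" is "café"; the escape avoids depending on the source file encoding.
        byte[] encoded = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        int bytes = countBytes(new ByteArrayInputStream(encoded));
        int chars = countChars(new InputStreamReader(
                new ByteArrayInputStream(encoded), StandardCharsets.UTF_8));

        System.out.println(bytes); // 5: three ASCII bytes plus two bytes for é
        System.out.println(chars); // 4: one char per letter
    }
}
```

Code that treated each byte as one character would see five "characters" here, which is the kind of assumption that only holds for pure ASCII text.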
Upvotes: 1
Reputation: 6875
An InputStream reads bytes, while a Reader reads characters. Because of the way bytes map to characters, you need to specify the character set (or encoding) when you create an InputStreamReader, the default being the platform character set.
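A minimal sketch of that mapping, assuming an in-memory source instead of a file (the class `CharsetMatters` and the helper `readAll` are illustrative names): the same six bytes are decoded once as UTF-8 and once as UTF-16, which reproduces the "strange" output described in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMatters {
    // Decodes every byte of the stream using the given charset.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] stored = "hello!".getBytes(StandardCharsets.UTF_8); // 6 bytes

        String asUtf8 = readAll(new ByteArrayInputStream(stored), StandardCharsets.UTF_8);
        String asUtf16 = readAll(new ByteArrayInputStream(stored), StandardCharsets.UTF_16);

        System.out.println(asUtf8);           // hello!
        System.out.println(asUtf16.length()); // 3: each pair of bytes became one char
        // What an InputStreamReader uses when no charset is given:
        System.out.println(Charset.defaultCharset());
    }
}
```

Passing the charset explicitly, rather than relying on `Charset.defaultCharset()`, keeps the result the same across machines with different platform settings.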
Upvotes: 3