wulfgarpro

Reputation: 6934

Why character streams?

I understand that Java character streams wrap byte streams such that the underlying byte stream is interpreted as per the system default or an otherwise specifically defined character set.

My system's default charset is UTF-8.

If I use a FileReader to read a text file, everything looks normal, because the default charset is used to interpret the bytes from the underlying file. If I explicitly construct an InputStreamReader that reads the UTF-8 encoded text file as UTF-16, the output is, unsurprisingly, garbled. If I use a byte stream such as FileInputStream and copy its output to System.out, everything looks fine.

So, my questions are;

Upvotes: 0

Views: 1303

Answers (3)

Jon Skeet

Reputation: 1501636

Code that deals with strings should only "think" in terms of text - for example, when reading an input source line by line, you don't want to care about the nature of that source.

However, storage is usually byte-oriented - so you need to create a conversion between the byte-oriented view of a source (encapsulated by InputStream) and the character-oriented view of a source (encapsulated by Reader).

So a method which (say) counts the lines of text in an input source should take a Reader parameter. If you want to count the lines of text in two files, one of which is encoded in UTF-8 and one of which is encoded in UTF-16, you'd create an InputStreamReader around a FileInputStream for each file, specifying the appropriate encoding each time.
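As a rough sketch of that idea (the class name LineCounter and the method name countLines are made up for illustration, and in-memory byte arrays stand in for the two files so the example runs on its own; with real files you would wrap a FileInputStream instead of a ByteArrayInputStream):

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical example class, not part of any library.
class LineCounter {
    // Works purely in terms of characters: it neither knows nor cares
    // how the text was encoded in the underlying source.
    static int countLines(Reader reader) throws IOException {
        BufferedReader buffered = new BufferedReader(reader);
        int count = 0;
        while (buffered.readLine() != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String text = "first\nsecond\nthird\n";

        // The same text stored in two different encodings
        // (stand-ins for a UTF-8 file and a UTF-16 file).
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        byte[] utf16Bytes = text.getBytes(StandardCharsets.UTF_16);

        // The encoding is specified once, where bytes become characters.
        try (Reader utf8Reader = new InputStreamReader(
                     new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8);
             Reader utf16Reader = new InputStreamReader(
                     new ByteArrayInputStream(utf16Bytes), StandardCharsets.UTF_16)) {
            System.out.println(countLines(utf8Reader));
            System.out.println(countLines(utf16Reader));
        }
    }
}
```

Both calls should report the same number of lines, since countLines only ever sees characters; the encoding is a detail of how each Reader was constructed.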

(Personally I would avoid FileReader completely - the fact that it doesn't let you specify an encoding makes it useless IMO.)

Upvotes: 6

Peter Lawrey

Reputation: 533620

When you are reading or writing text that may contain characters above 127, use a character stream. When you are reading or writing binary data, use a byte stream.

You can read text as binary if you wish, but unless you make a lot of assumptions, it rarely gains you much.

Upvotes: 1

Laurent Pireyn

Reputation: 6875

An InputStream reads bytes, while a Reader reads characters. Because of the way bytes map to characters, you need to specify the character set (or encoding) when you create an InputStreamReader, the default being the platform character set.
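To illustrate why the character set matters, here is a small sketch (DecodeDemo and decode are hypothetical names chosen for this example) that decodes the same two bytes through an InputStreamReader with two different charsets:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical example class, not part of any library.
class DecodeDemo {
    // Reads all characters from the bytes, interpreting them with the given charset.
    static String decode(byte[] bytes, Charset charset) throws IOException {
        StringBuilder result = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                result.append((char) c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        // U+00E9 (e with acute accent) is encoded in UTF-8 as two bytes: 0xC3 0xA9.
        byte[] bytes = "\u00e9".getBytes(StandardCharsets.UTF_8);

        // Decoded as UTF-8, the two bytes form one character.
        System.out.println(decode(bytes, StandardCharsets.UTF_8).length());      // prints 1

        // Decoded as ISO-8859-1, each byte becomes its own character.
        System.out.println(decode(bytes, StandardCharsets.ISO_8859_1).length()); // prints 2
    }
}
```

The bytes are identical in both calls; only the charset passed to the InputStreamReader changes how they are turned into characters.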

Upvotes: 3
