Does encoding matter when writing to a file?

Question

I was told today that when writing to a file, the encoding you write in doesn't matter. I don't know a lot about encodings, but this sounds reasonable, considering that an encoding is only for reading/viewing?

Does the encoding in which bytes are read from a file matter? Is the encoding there only for parsing/display?

Example:

var bytes = getFileBytes();
bytes.remove(new byte[] { 232, 211 });
anotherStream.writeBytes(bytes);
// I'm assuming that Encoding is irrelevant

tripleee · Accepted Answer

What I think somebody might have told you is that if you have to choose between encodings, it doesn't matter which one you pick as long as you stick to it.

This ignores issues like the efficiency of the encoding: if one of them stores your typical data in fewer bytes, use that one.
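To make the efficiency point concrete, here is a minimal Python sketch that counts how many bytes the same text takes in a few encodings. The sample string and the helper name encoded_size are my own, picked only for illustration.

```python
def encoded_size(text, codec):
    """Return the number of bytes `text` occupies when encoded with `codec`."""
    return len(text.encode(codec))


# Hypothetical sample: "naive cafe" with two accented letters,
# written with escapes so the source file itself stays ASCII.
sample = "na\u00efve caf\u00e9"

for codec in ("utf-8", "utf-16-le", "latin-1"):
    print(codec, encoded_size(sample, codec))
```

Which encoding comes out smallest depends on the data, so measure with text that resembles what you will actually store.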

Consider the opposite scenario: you could write in one encoding and then either (a) forget about ever reading the data back in or (b) read the data incorrectly.
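Case (b) can be sketched in a few lines of Python. The string and the choice of UTF-8 versus Latin-1 are assumptions for the sake of the example; any mismatched pair of encodings shows the same kind of damage.

```python
original = "caf\u00e9"              # "cafe" with an accented e

stored = original.encode("utf-8")   # the bytes that end up in the file

right = stored.decode("utf-8")      # decoded with the encoding used for writing
wrong = stored.decode("latin-1")    # decoded with a different encoding

print(right == original)            # the text survives the round trip
print(wrong == original)            # the text does not survive
print(len(original), len(wrong))    # the wrong decoding even changes the length
```

Note that the bytes in `stored` are identical in both cases. If you only copy bytes from one stream to another and never turn them into text, no decoding happens and nothing can go wrong at that step.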

To use a contrived example, let's say you cannot use the lowercase letter i in your data file for some reason. So to store it, you need to encode it somehow. You decide to store it as \48. But now, how do you represent the literal sequence \48 unambiguously, should you ever need to? Aha, your encoding can accommodate that, too: store any literal backslash as \5C. But of course, when you read the file back in, you have to decode this encoding, or you will end up with the wrong bytes. (ThÁ&s Á&s more common than you may thÁ&nk!)
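The contrived scheme above could be sketched in Python like this. The function names encode and decode and the lookup tables are my own; only the two rules (i becomes \48, a literal backslash becomes \5C) come from the example.

```python
# The two rules of the contrived encoding.
ESCAPES = {"i": "\\48", "\\": "\\5C"}
UNESCAPES = {code: char for char, code in ESCAPES.items()}


def encode(text):
    """Replace every 'i' and every backslash with its escape sequence."""
    return "".join(ESCAPES.get(ch, ch) for ch in text)


def decode(data):
    """Reverse encode(); raises KeyError on an unknown escape sequence."""
    out = []
    pos = 0
    while pos < len(data):
        if data[pos] == "\\":
            # An escape is always a backslash plus two more characters.
            out.append(UNESCAPES[data[pos:pos + 3]])
            pos += 3
        else:
            out.append(data[pos])
            pos += 1
    return "".join(out)


message = "this is it, literally \\48"
stored = encode(message)
print(stored)                     # what ends up in the file
print(decode(stored) == message)  # decoding restores the original
```

Skipping decode and using `stored` directly is exactly the "read the data incorrectly" case: the bytes are all there, but they no longer mean what you wrote.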
