Daniel

Reputation: 28074

How to make InputStreamReader fail on data that is invalid for the encoding?

I have some bytes that should be UTF-8 encoded, but they may contain text in ISO-8859-1 encoding if the user did not configure their text editor correctly.

I read the file with an InputStreamReader:

InputStreamReader reader = new InputStreamReader( 
    new FileInputStream(file), Charset.forName("UTF-8"));

But whenever the user uses umlauts like "ä", which are invalid UTF-8 when stored as ISO-8859-1, the InputStreamReader does not complain but inserts placeholder characters instead.

Is there a simple way to make this throw an exception on invalid input?

Upvotes: 7

Views: 1522

Answers (2)

Esailija

Reputation: 140210

Simply add .newDecoder(). A decoder created this way reports malformed input by default instead of replacing it:

InputStreamReader reader = new InputStreamReader( 
    new FileInputStream(file), Charset.forName("UTF-8").newDecoder());

Upvotes: 1

Mikhail Vladimirov

Reputation: 13890

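// Requires java.nio.charset.Charset, CharsetDecoder and CodingErrorAction.
// REPORT makes read() throw a CharacterCodingException instead of substituting U+FFFD.
CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder();
decoder.onMalformedInput(CodingErrorAction.REPORT);
decoder.onUnmappableCharacter(CodingErrorAction.REPORT);
InputStreamReader reader = new InputStreamReader(
    new FileInputStream(file), decoder);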
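
To see the effect end to end, here is a minimal, self-contained sketch that runs a strict decoder over in-memory bytes instead of a file. The class name StrictUtf8Demo and the helpers readStrict and isValidUtf8 are made up for illustration and are not part of the JDK; the sample text "Bär" is only an assumption about what such input might look like.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictUtf8Demo {

    // Hypothetical helper: reads the whole stream as UTF-8 and lets the
    // decoder throw a CharacterCodingException on invalid byte sequences.
    static String readStrict(InputStream in) throws IOException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, decoder)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: true if the bytes decode cleanly as UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            readStrict(new ByteArrayInputStream(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false; // malformed or unmappable input was reported
        } catch (IOException e) {
            throw new UncheckedIOException(e); // unrelated I/O failure
        }
    }

    public static void main(String[] args) {
        // "B\u00e4r" is "Bär"; the escape avoids source-encoding issues.
        byte[] utf8 = "B\u00e4r".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "B\u00e4r".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(isValidUtf8(utf8));
        System.out.println(isValidUtf8(latin1));
    }
}
```

In ISO-8859-1, "ä" is the single byte 0xE4, which UTF-8 treats as the start of a three-byte sequence; because the following byte is not a continuation byte, the strict decoder reports the input as malformed rather than inserting a replacement character.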

Upvotes: 7
