Check if bytes result in valid ISO 8859-15 (Latin) in Python

Question

I want to test if a string of bytes that I'm extracting from a file results in valid ISO-8859-15 encoded text. The first thing I came across is this similar case about UTF-8 validation:

https://stackoverflow.com/a/5259160/1209004

So based on that, I thought I was being clever by doing something similar for ISO-8859-15. See the following demo code:

#! /usr/bin/env python

def isValidISO885915(data):
    # Test if data decodes as valid ISO-8859-15
    # (parameter renamed from "bytes" to avoid shadowing the built-in type)
    try:
        data.decode('iso-8859-15', 'strict')
        return True
    except UnicodeDecodeError:
        return False

def main():
    # Test bytes (byte x95 is not defined in ISO-8859-15!)
    data = b'\x4A\x70\x79\x6C\x79\x7A\x65\x72\x20\x64\x95\x6D\x6F\xFF'

    isValidLatin = isValidISO885915(data)
    print(isValidLatin)

if __name__ == '__main__':
    main()

However, running this prints True, even though 0x95 is not a valid code point in ISO-8859-15! Am I overlooking something really obvious here? (I tried this with Python 2.7.4 and 3.3; the results are identical in both cases.)
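
A likely explanation, offered as a sketch rather than a definitive answer: Python's iso-8859-15 codec assigns a character to every one of the 256 byte values. Bytes 0x80 to 0x9F decode to the C1 control characters U+0080 to U+009F, so a strict decode never raises UnicodeDecodeError and cannot be used as a validity test. If "valid" is taken to mean "contains no control bytes other than tab, line feed and carriage return", one option is to inspect the byte values directly. The helper name is_printable_iso8859_15 and the set of allowed control bytes are assumptions made for illustration, not part of the original code.

```python
def is_printable_iso8859_15(data):
    """Hypothetical helper: True if data holds no C0/C1 control bytes
    other than tab, line feed and carriage return (an assumed whitelist)."""
    allowed_controls = {0x09, 0x0A, 0x0D}
    # bytearray yields integers on both Python 2 and Python 3
    for byte in bytearray(data):
        is_c0 = byte < 0x20 or byte == 0x7F
        is_c1 = 0x80 <= byte <= 0x9F
        if (is_c0 or is_c1) and byte not in allowed_controls:
            return False
    return True

# The codec maps 0x95 to the C1 control character U+0095, so decoding succeeds
print(b'\x95'.decode('iso-8859-15') == u'\x95')  # True

sample = b'\x4A\x70\x79\x6C\x79\x7A\x65\x72\x20\x64\x95\x6D\x6F\xFF'
print(is_printable_iso8859_15(sample))               # False (0x95 is a C1 control byte)
print(is_printable_iso8859_15(b'Jpylyzer d\xE9mo'))  # True
```

Whether C1 control bytes should count as invalid depends on the use case, so the whitelist may need adjusting for the files being checked.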
