Reputation: 878
I have a file, text.ucs, which I am trying to decode using Python.
file = open('text.ucs', 'r')
content = file.read()
print content
My result is
\xf\xe\x002\22
I tried decoding it as UTF-16 and as UTF-8:
content.decode('utf-16')
and got this error:
Traceback (most recent call last):
  File "", line 1, in
  File "C:\Python27\lib\encodings\utf_16.py", line 16, in decode
    return codecs.utf_16_decode(input, errors, True)
UnicodeDecodeError: 'utf16' codec can't decode bytes in position 32-33: illegal encoding
Please let me know if I am missing anything or if my approach is wrong.
Edit: A screenshot was requested.
Upvotes: 0
Views: 1470
Reputation: 424
As far as I understand, you are using Python 2.x, but the encoding parameter of the built-in open() was only added in Python 3.x. I am not an expert in Python 2.x, but you can look up io.open, which accepts that parameter. For example, try:
import io
file = io.open('text.ucs', 'r', encoding='utf-8')
content = file.read()
print content
Note that the io module has to be imported first.
Upvotes: 1
Reputation: 963
Your string needs to be decoded as UTF-8. You can do it like this:
f = open('text.ucs', 'r', encoding='utf-8')
print(f.read())
Upvotes: 0
Reputation: 32258
The string is encoded as UTF-16BE (big-endian). This works:
content.decode("utf-16-be")
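A minimal sketch of why naming the byte order helps, assuming the data carries no byte order mark (BOM). The sample bytes and the decode_utf16 helper are made up for illustration and are not the contents of the asker's file. Without a BOM, CPython's plain utf-16 codec falls back to the machine's native byte order, which is little-endian on most PCs, so big-endian data can be misread or rejected.

```python
import codecs

# Hypothetical sample: "hi" encoded as UTF-16 big-endian, without a BOM.
raw = b"\x00h\x00i"

# Naming the byte order explicitly decodes it correctly.
text = raw.decode("utf-16-be")


def decode_utf16(data, default="utf-16-be"):
    """Decode UTF-16 bytes, honouring a BOM if present, else using `default`."""
    if data.startswith(codecs.BOM_UTF16_BE) or data.startswith(codecs.BOM_UTF16_LE):
        # The generic codec reads the BOM, picks the byte order, and strips it.
        return data.decode("utf-16")
    # No BOM: fall back to the byte order the caller assumes.
    return data.decode(default)
```

If the file does start with a BOM, the helper lets the generic codec handle it; otherwise it uses the big-endian assumption from this answer.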
Upvotes: 1
Reputation: 424
You can specify which encoding to use with the encoding argument:
with open('text.ucs', 'r', encoding='utf-16') as f:
    text = f.read()
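Since the built-in open() accepts encoding only in Python 3, here is a rough sketch of the same idea using io.open, which should behave the same way on Python 2.6+ and Python 3. The temporary file and its contents are invented for illustration. The file is written with the utf-16 codec, which adds a BOM, so the same codec can read it back.

```python
import io
import os
import tempfile

# Create a throwaway UTF-16 file (with a BOM) purely for illustration.
path = os.path.join(tempfile.mkdtemp(), "sample.ucs")
with io.open(path, "w", encoding="utf-16") as f:
    f.write(u"hello")

# io.open decodes while reading; the utf-16 codec consumes the BOM.
with io.open(path, "r", encoding="utf-16") as f:
    text = f.read()
```

For a file without a BOM, the encoding would have to name the byte order instead, for example utf-16-be.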
Upvotes: 0