cyborg

Reputation: 878

Unable to decode this string using Python

I have this text.ucs file which I am trying to decode using python.

file = open('text.ucs', 'r')
content = file.read()
print content

My result is

\xf\xe\x002\22

I tried decoding it with utf-16 and utf-8:

content.decode('utf-16')

and I get this error:

Traceback (most recent call last):
  File "", line 1, in
  File "C:\Python27\lib\encodings\utf_16.py", line 16, in decode
    return codecs.utf_16_decode(input, errors, True)
UnicodeDecodeError: 'utf16' codec can't decode bytes in position 32-33: illegal encoding

Please let me know if I am missing anything or if my approach is wrong.

Edit: A screenshot was requested.

Upvotes: 0

Views: 1470

Answers (4)

artur1214

Reputation: 424

As I understand it, you are using Python 2.x, but the encoding parameter of the built-in open() was only added in Python 3.x. I am not an expert in Python 2.x, but you can look into io.open. For example, try:

import io  # io.open accepts encoding= on Python 2.6+

file = io.open('text.ucs', 'r', encoding='utf-8')
content = file.read()
print content

Make sure the io module is imported before you call io.open.
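
For what it's worth, here is a minimal sketch of the io.open approach that should run on both Python 2.7 and 3.x. The sample file name and its contents are made up for illustration, and utf-16-be is only an assumption about how such a file might be encoded, so swap in whichever encoding your file actually uses:

```python
import io
import os
import tempfile

# Hypothetical sample file, written as UTF-16 big-endian without a BOM.
path = os.path.join(tempfile.mkdtemp(), 'sample.ucs')
with io.open(path, 'wb') as f:
    f.write(u'caf\u00e9'.encode('utf-16-be'))

# io.open takes an encoding argument on Python 2.6+ as well as Python 3,
# and read() returns decoded text (unicode on 2.x, str on 3.x).
with io.open(path, 'r', encoding='utf-16-be') as f:
    content = f.read()
```

Using a with block also closes the file for you, which the plain open()/read() version does not.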

Upvotes: 1

Skiller Dz

Reputation: 963

Your string needs to be decoded as UTF-8. Here is what I did to decode it:

f = open('text.ucs', 'r', encoding='utf-8')  # encoding= requires Python 3
print(f.read())  # print the decoded contents, not the file object

Upvotes: 0

filmor

Reputation: 32258

The string is encoded as UTF-16-BE (big-endian). This works:

content.decode("utf-16-be")
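
A rough sketch of why the explicit byte order matters: the plain utf-16 codec relies on a byte order mark (BOM) to pick the endianness, so a file without one has to be decoded with utf-16-be or utf-16-le directly. The helper name decode_utf16 is hypothetical, and falling back to big-endian is an assumption based on this particular file:

```python
import codecs

def decode_utf16(raw):
    """Decode UTF-16 bytes, honouring a BOM if present.

    Falls back to big-endian when there is no BOM (an assumption
    that matches this file, not a general rule).
    """
    if raw.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        # The generic codec reads the BOM and strips it from the result.
        return raw.decode('utf-16')
    return raw.decode('utf-16-be')

# Big-endian bytes without a BOM: b'\x00h\x00e\x00l\x00l\x00o'
sample = u'hello'.encode('utf-16-be')
text = decode_utf16(sample)
```

If the bytes were read in text mode on Windows, it may also be worth reopening the file with 'rb' so the raw bytes reach the decoder unchanged.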

Upvotes: 1

artur1214

Reputation: 424

You can specify which encoding to use with the encoding argument:

with open('text.ucs', 'r', encoding='utf-16') as f:
    text = f.read()
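
If you are not sure which encoding to pass, one option is to read the raw bytes and try a few candidates. This is only a diagnostic sketch: try_decodings is a hypothetical helper, the candidate list is my own guess, and a decode that succeeds is not proof that the encoding is right, so inspect the output by eye:

```python
def try_decodings(raw, candidates=('utf-8', 'utf-16', 'utf-16-le', 'utf-16-be')):
    """Return a dict mapping each candidate encoding to the decoded text,
    or to None when decoding raises UnicodeDecodeError."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results

# Stand-in for bytes read with open('text.ucs', 'rb').read()
raw = u'hi'.encode('utf-16-be')
results = try_decodings(raw)
```

Here results['utf-16-be'] holds the intended text, while other candidates may decode without error into NUL-padded or unrelated characters.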

Upvotes: 0
