rkhan8
rkhan8

Reputation: 43

base64 decode output has non-ascii characters

I am having trouble properly decoding base64 data. It decodes the message properly, but also includes a ton of non-ascii characters which then I have to clean as well, so I was wondering if I was decoding it incorrectly or if I will need to create a script to clean the text post decoding. Below is the python code and part of the output I am getting to illustrate. Thanks!

message= base64.b64decode(base64_message).decode(errors='ignore')

enter image description here

Upvotes: 0

Views: 2750

Answers (1)

lenik
lenik

Reputation: 23498

You're obviously trying to decode a Word document, which is by definition not plain text at all. Make sure what you're trying to decode is text. Otherwise save the decoding result to a file (file.docx?) and open it in the appropriate application.


Following up your question in the comments, you don't have to get the text from base64, leave it as it is and write to the file. Instead of

base64.b64decode(base64_message).decode(errors='ignore')

use just

base64.b64decode(base64_message)

and everything will be fine:

>>> a = base64.b64encode('\x01\x02\x04')
>>> a
'AQIE'
>>> base64.b64decode(a)
'\x01\x02\x04'

Upvotes: 1

Related Questions