Reputation: 497
I'm trying to convert my output to JSON but I'm getting an error. If I remove the json.dump call, I get the data in Base64 form, but with json.dump it raises an error.
Code:
import json
import base64
with open(r"C:/Users/Documents/pdf2txt/outputImage.jpg","rb") as img:
    image = base64.b64encode(img.read())
data['ProcessedImage'] = image
print(json.dump(data))
Output:
TypeError: Object of type 'bytes' is not JSON serializable
When using:
print(json.dumps(dict(data)))
It's also showing the same error
Upvotes: 8
Views: 35991
Reputation: 123423
First of all, I think you should use json.dumps(), because you're calling json.dump() with the incorrect arguments (it also needs a file object to write to) and it doesn't return anything to print.
Secondly, as the error message indicates, you can't serialize objects of type bytes: json.dumps() can serialize str values but not bytes. To do this properly you need to decode the binary data into a Python string with some encoding. To preserve the data properly, you should use latin1 encoding, because arbitrary binary strings are valid latin1, which can always be decoded to Unicode and then encoded back to the original string again (as pointed out in this answer by Sven Marnach).
Here's your code showing how to do that (plus corrections for the other not-directly-related problems it had):
import json
import base64
image_path = "C:/Users/Documents/pdf2txt/outputImage.jpg"
data = {}
with open(image_path, "rb") as img:
    image = base64.b64encode(img.read()).decode('latin1')
data['ProcessedImage'] = image
print(json.dumps(data))
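To illustrate the latin1 point above, here is a minimal sketch (not part of the original answer) that checks the round trip on every possible byte value. The names raw and encoded are made up for the illustration. It also shows that Base64 output is plain ASCII, so for this particular case decoding with 'ascii' should give the same text.

```python
import base64

# Every possible byte value, standing in for arbitrary binary data.
raw = bytes(range(256))

# latin1 maps each byte to the code point with the same number,
# so decoding and re-encoding gives back the original bytes.
assert raw.decode("latin1").encode("latin1") == raw

# Base64 output uses only ASCII characters, so both codecs agree here.
encoded = base64.b64encode(raw)
assert encoded.decode("latin1") == encoded.decode("ascii")
```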
Upvotes: 5
Reputation: 660
You have to use the bytes.decode() method.
You are trying to serialize an object of type bytes to JSON. JSON has no such type, so you have to convert the bytes to a string first.
Also, you should use json.dumps() instead of json.dump(), because you don't want to write to a file.
In your example:
import json
import base64
data = {}  # not defined in the question; assumed here to be a dict
with open(r"C:/Users/Documents/pdf2txt/outputImage.jpg", "rb") as img:
    image = base64.b64encode(img.read())
data['ProcessedImage'] = image.decode() # not just image
print(json.dumps(data))
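The difference between json.dumps() and json.dump() can be seen in a small sketch like the one below. It uses an in-memory io.StringIO as a stand-in for a real file, and the Base64 text is a placeholder (the encoding of b"hello"), not a real image.

```python
import io
import json

# Placeholder Base64 text (b"hello"), used instead of real image data.
data = {"ProcessedImage": "aGVsbG8="}

# json.dumps returns the JSON document as a str.
text = json.dumps(data)

# json.dump writes to a file-like object and returns None,
# which is why printing its result is not useful.
buffer = io.StringIO()  # stands in for a file opened in text mode
result = json.dump(data, buffer)

print(text)
print(result)
```

With a real file you would pass the object returned by open(path, "w") as the second argument instead of the buffer.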
Upvotes: 7
Reputation: 57033
image (or anything returned by base64.b64encode) is a binary bytes object, not a string. JSON cannot deal with binary data. You must decode the image data if you want to serialize it:
data['ProcessedImage'] = image.decode()
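If it helps, here is a rough sketch of the full round trip, showing that the original bytes can be recovered from the JSON text on the receiving side. The byte string original is an invented stand-in for img.read(), not real JPEG data.

```python
import base64
import json

# Invented stand-in for the bytes returned by img.read().
original = b"\xff\xd8\xff\xe0 not a real JPEG"

# Sender: Base64-encode, decode to str, then serialize.
payload = json.dumps({"ProcessedImage": base64.b64encode(original).decode()})

# Receiver: parse the JSON, then Base64-decode the string back to bytes.
restored = base64.b64decode(json.loads(payload)["ProcessedImage"])

assert restored == original
```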
Upvotes: 3