Zhibin
Zhibin

Reputation: 1049

TypeError: Object of type 'bytes' is not JSON serializable

I just started programming Python. I want to use scrapy to create a bot,and it showed TypeError: Object of type 'bytes' is not JSON serializable when I run the project.

import json
import codecs

class W3SchoolPipeline(object):

  def __init__(self):
      self.file = codecs.open('w3school_data_utf8.json', 'wb', encoding='utf-8')

  def process_item(self, item, spider):
      line = json.dumps(dict(item)) + '\n'
      # print line

      self.file.write(line.decode("unicode_escape"))
      return item

from scrapy.spiders import Spider
from scrapy.selector import Selector
from w3school.items import W3schoolItem

class W3schoolSpider(Spider):

    name = "w3school"
    allowed_domains = ["w3school.com.cn"]

    start_urls = [
        "http://www.w3school.com.cn/xml/xml_syntax.asp"
    ]

    def parse(self, response):
        sel = Selector(response)
        sites = sel.xpath('//div[@id="navsecond"]/div[@id="course"]/ul[1]/li')

    items = []
    for site in sites:
        item = W3schoolItem()
        title = site.xpath('a/text()').extract()
        link = site.xpath('a/@href').extract()
        desc = site.xpath('a/@title').extract()

        item['title'] = [t.encode('utf-8') for t in title]
        item['link'] = [l.encode('utf-8') for l in link]
        item['desc'] = [d.encode('utf-8') for d in desc]
        items.append(item)
        return items

Traceback:

TypeError: Object of type 'bytes' is not JSON serializable
2017-06-23 01:41:15 [scrapy.core.scraper] ERROR: Error processing       {'desc': [b'\x
e4\xbd\xbf\xe7\x94\xa8 XSLT \xe6\x98\xbe\xe7\xa4\xba XML'],
 'link': [b'/xml/xml_xsl.asp'],
 'title': [b'XML XSLT']}

Traceback (most recent call last):
File  
"c:\users\administrator\appdata\local\programs\python\python36\lib\site-p
ackages\twisted\internet\defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
File "D:\LZZZZB\w3school\w3school\pipelines.py", line 19, in process_item
    line = json.dumps(dict(item)) + '\n'
File 
"c:\users\administrator\appdata\local\programs\python\python36\lib\json\_
_init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
File 
"c:\users\administrator\appdata\local\programs\python\python36\lib\json\e
ncoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
File  
"c:\users\administrator\appdata\local\programs\python\python36\lib\json\e
ncoder.py", line 257, in iterencode
    return _iterencode(o, 0)
File      
"c:\users\administrator\appdata\local\programs\python\python36\lib\
json\encoder.py", line 180, in default
    o.__class__.__name__)
  TypeError: Object of type 'bytes' is not JSON serializable

Upvotes: 103

Views: 365335

Answers (7)

iretex
iretex

Reputation: 69

In my case, I fixed the issue by simply casting the line to str object before writing to a file.

import json

# Specify the file path where you want to save the JSON data
file_path = f"{machine_number}_can_data.json"

# Write the list to a JSON file
with open(file_path, 'w', encoding='utf-8') as json_file:
    json.dump(str(deserialized_value.get('CanMessages')), json_file)

Upvotes: 0

Carlos Cares
Carlos Cares

Reputation: 11

#This works for me:

thestrvar = base64.b64encode(thevartocode.encode('utf-8')).decode()

thestrvar looks like PGh0bWw+DQo8Ym9k... then

rec = {}
rec['myidx'] = thestrvar
print(json.dumps(rec))   #without problems

Upvotes: -2

Meta Airdroper
Meta Airdroper

Reputation: 23

I fixed my own by adding .hex() at the end of the variable

Example

test = "0xfggfddr"
return test.hex()

Upvotes: 1

Jiten Patel
Jiten Patel

Reputation: 180

I had the same error while playing with image data. In Python, on the client-side, I was reading an image file and sending it to my server. My server decodes binary data into bytes. So, I just converted my bytes data into a string that solved my error.

image_bytes =  file.read()
data = {"key1": "value1", "key2" :  "value2", "image" : str(image_bytes)}

Upvotes: 2

Furqan Ali
Furqan Ali

Reputation: 727

Simply write <variable name>.decode("utf-8").

For example:

myvar = b'asdqweasdasd'
myvar.decode("utf-8")

Upvotes: 49

Jordan Donovan
Jordan Donovan

Reputation: 91

I was dealing with this issue today, and I knew that I had something encoded as a bytes object that I was trying to serialize as json with json.dump(my_json_object, write_to_file.json). my_json_object in this case was a very large json object that I had created, so I had several dicts, lists, and strings to look at to find what was still in bytes format.

The way I ended up solving it: the write_to_file.json will have everything up to the bytes object that is causing the issue.

In my particular case this was a line obtained through

for line in text:
    json_object['line'] = line.strip()

I solved by first finding this error with the help of the write_to_file.json, then by correcting it to:

for line in text:
    json_object['line'] = line.strip().decode()

Upvotes: 8

Martijn Pieters
Martijn Pieters

Reputation: 1123350

You are creating those bytes objects yourself:

item['title'] = [t.encode('utf-8') for t in title]
item['link'] = [l.encode('utf-8') for l in link]
item['desc'] = [d.encode('utf-8') for d in desc]
items.append(item)

Each of those t.encode(), l.encode() and d.encode() calls creates a bytes string. Do not do this, leave it to the JSON format to serialise these.

Next, you are making several other errors; you are encoding too much where there is no need to. Leave it to the json module and the standard file object returned by the open() call to handle encoding.

You also don't need to convert your items list to a dictionary; it'll already be an object that can be JSON encoded directly:

class W3SchoolPipeline(object):    
    def __init__(self):
        self.file = open('w3school_data_utf8.json', 'w', encoding='utf-8')

    def process_item(self, item, spider):
        line = json.dumps(item) + '\n'
        self.file.write(line)
        return item

I'm guessing you followed a tutorial that assumed Python 2, you are using Python 3 instead. I strongly suggest you find a different tutorial; not only is it written for an outdated version of Python, if it is advocating line.decode('unicode_escape') it is teaching some extremely bad habits that'll lead to hard-to-track bugs. I can recommend you look at Think Python, 2nd edition for a good, free, book on learning Python 3.

Upvotes: 52

Related Questions