merlin

Reputation: 2917

How to add custom information to a JSON file with Scrapy in Python

I am exporting data from an item to a JSON file with Scrapy's JsonItemExporter. Now I would like to add some basic information about the data to the file, e.g. the partner name or page name.

I put this code into my pipeline:

import datetime
import json

from scrapy.exporters import JsonItemExporter


class BidPipeline(object):

    file = None

    def open_spider(self, spider):
        # one output file per day and spider, opened in binary mode
        self.file = open('data/' + datetime.datetime.now().strftime("%Y%m%d") + '_' + spider.name + '.json', 'wb')
        self.exporter = JsonItemExporter(self.file)

        # trying to add partner info
        a = {'partner': 3}
        line = json.dumps(a) + "\n"
        self.file.write(line)

        self.exporter.start_exporting()

This results in the following traceback:

yield self.engine.open_spider(self.spider, start_requests)
builtins.TypeError: a bytes-like object is required, not 'str'

My goal is to add some information to the JSON file before the items are exported, so that whoever processes the data later can determine, for example, its source.

What would be the best way to achieve this?

Upvotes: 1

Views: 108

Answers (1)

Granitosaurus

Reputation: 21436

The error is pretty self-explanatory here:

a bytes-like object is required, not 'str'

You open the file to write bytes (wb), but then you try to write a string:

def open_spider(self, spider):
    self.file = open(..., 'wb')
                          ^^^^^
    ...
    a = {'partner': 3}
    line = json.dumps(a) + "\n"
                           ^^^^
    self.file.write(line)

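To resolve this, encode your line before writing it to the file. Opening the file in text mode (just w instead of wb) would also make this particular write succeed, but Scrapy's documentation says the file passed to an item exporter should accept bytes, so keeping wb and encoding the line is the safer choice here: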

    self.file.write(line.encode())

As a general rule, use w when writing text and wb when writing bytes (e.g. image data).
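
The encode approach can be sketched with the standard library alone. The helper name write_metadata_line is hypothetical, and io.BytesIO stands in for the file opened with 'wb'. In the pipeline from the question, the same write would go before self.exporter.start_exporting().

```python
import io
import json


def write_metadata_line(binary_file, metadata):
    """Serialize metadata as one JSON line and write it to a binary-mode file."""
    # json.dumps returns a str; a file opened with 'wb' only accepts bytes,
    # so the line is encoded explicitly before it is written.
    line = json.dumps(metadata) + "\n"
    binary_file.write(line.encode("utf-8"))


# io.BytesIO behaves like a file opened in binary mode.
buffer = io.BytesIO()
write_metadata_line(buffer, {"partner": 3})
header_bytes = buffer.getvalue()
```

Keep in mind that a line written this way sits ahead of the JSON array that JsonItemExporter produces, so whatever reads the file later would probably need to handle that first line separately.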

Upvotes: 1
