user2492364

Reputation: 6713

Scrapy: custom pipeline that saves a file named after the spider

This is a custom pipeline, and I want to save the output file under the spider's name.
Here is my code. It creates a JSON file, but the file contains only one item.
How should I change the code? The file should contain 10 items.

class JsonWithEncodingPipeline(object):
    # def __init__(self):
    #     # save file with fixed name
    #     self.file = codecs.open('outputbytest.json ', 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # How to save file with dynamic name??
        self.file = codecs.open('%s_outputchiness.json' % spider.name, 'w', encoding='utf-8')

        line = json.dumps(dict(item)) + "\n"
        self.file.write(line.encode('utf-8').decode("unicode_escape"))
        return item

    def spider_closed(self, spider):
        self.file.close()

Upvotes: 1

Views: 895

Answers (1)

marven

Reputation: 1846

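Opening a file in 'w' mode truncates it, discarding any existing content. Because process_item runs once per item, your code reopens and empties the file for every item, so only the last item survives. Open the file in open_spider instead, so that it is opened once per run rather than every time you write an item.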

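import codecs
import json

class JsonWithEncodingPipeline(object):
    def open_spider(self, spider):
        # Called once when the spider starts, so the file is truncated only once.
        self.file = codecs.open('%s_outputchiness.json' % spider.name, 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # ensure_ascii=False keeps non-ASCII characters readable without the unicode_escape round trip.
        line = json.dumps(dict(item), ensure_ascii=False) + "\n"
        self.file.write(line)
        return item  # return the item so later pipelines still receive it

    def close_spider(self, spider):
        # Scrapy calls close_spider on pipelines; spider_closed is only run if connected as a signal handler.
        self.file.close()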

If you want a different file name every time you run the spider, I suggest including the current date and time in the name.
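As a minimal sketch of that suggestion: the helper name timestamped_name, the class name TimestampedJsonPipeline, and the '%Y%m%d_%H%M%S' format are my own assumptions for illustration, not anything Scrapy prescribes. The file name is built once in open_spider, so every item from the same run goes into the same file.

```python
import io
import json
from datetime import datetime


def timestamped_name(spider_name, now=None):
    """Build a name like 'myspider_20240131_235959.json' (hypothetical format)."""
    now = now or datetime.now()
    return '%s_%s.json' % (spider_name, now.strftime('%Y%m%d_%H%M%S'))


class TimestampedJsonPipeline(object):
    def open_spider(self, spider):
        # Compute the name once per run; each run gets its own file.
        self.file = io.open(timestamped_name(spider.name), 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # One JSON object per line.
        self.file.write(json.dumps(dict(item), ensure_ascii=False) + '\n')
        return item

    def close_spider(self, spider):
        self.file.close()
```

With second-level resolution, two runs of the same spider started within the same second would still share a name; add microseconds ('%f') to the format if that matters.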

Upvotes: 3
