Reputation: 1487
I wrote the following pipeline so that the extracted items are written directly into an Excel file. I ran the spider without any errors, but the file isn't being saved. I know that it is missing workbook.close(); the problem is that I do not know where to put it inside the code.
from datetime import datetime
import xlsxwriter
ordered_list=['Link','Price','Date','discount']
class guiPipeline(object):
    def __init__(self):
        now = datetime.now()
        workbook = xlsxwriter.Workbook('data.xlsx')
        self.worksheet = workbook.add_worksheet()
        self.write_first_row()
        self.index = 1

    def process_item(self, item, spider):
        for _key, _value in item.items():
            col = ordered_list.index(_key)
            self.worksheet.write(self.index, col, _value)
        self.index += 1
        return item

    def write_first_row(self):
        for header in ordered_list:
            col = ordered_list.index(header)
            self.worksheet.write(0, col, header)
This is my pipeline; I just need to know where to call close() on the workbook when the spider is finished.
Upvotes: 2
Views: 47
Reputation: 2183
You have some methods that can be called when the spider is opened or closed: http://doc.scrapy.org/en/latest/topics/item-pipeline.html#close_spider
You can find an example in the docs here: http://doc.scrapy.org/en/latest/topics/item-pipeline.html#write-items-to-a-json-file
You will also have to add crawler.signals.connect(self.close_spider, signals.spider_closed) in your __init__.
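For reference, here is a minimal sketch of how the pipeline could look with a close_spider method added, modelled on the JSON-writer pipeline example from the linked docs. It assumes the workbook is stored on self (in the question's code it is a local variable in __init__, so close_spider would have nothing to close):

import xlsxwriter

ordered_list = ['Link', 'Price', 'Date', 'discount']

class guiPipeline(object):
    def __init__(self):
        # Keep a reference to the workbook so close_spider can close it later
        self.workbook = xlsxwriter.Workbook('data.xlsx')
        self.worksheet = self.workbook.add_worksheet()
        self.write_first_row()
        self.index = 1

    def process_item(self, item, spider):
        # One row per item, columns ordered according to ordered_list
        for _key, _value in item.items():
            col = ordered_list.index(_key)
            self.worksheet.write(self.index, col, _value)
        self.index += 1
        return item

    def write_first_row(self):
        for header in ordered_list:
            self.worksheet.write(0, ordered_list.index(header), header)

    def close_spider(self, spider):
        # Scrapy calls this when the spider is closed; closing the workbook
        # is what actually flushes the data to data.xlsx
        self.workbook.close()

The key change is self.workbook instead of a local workbook variable in __init__, so that close_spider can reach it when the spider finishes.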
Upvotes: 1