George

Reputation: 37

How do I write to a JSON file the time when a Scrapy spider has finished scraping?

I need an example of how to record, in the JSON output file, the time when the Scrapy spider/crawler stops (completes) collecting the data. My current code is below.

Example CrawlSpider:

from scrapy.http import Request
from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor
from ebaycomp.items import EbayItem


class EbaySpider(CrawlSpider):

    name = 'spider'
    allowed_domains = ['ebay.co.uk']

    start_urls = ['https://www.ebay.co.uk/sch/49831/i.html?_from=R40&_nkw=chain+and+sprocket+kit&LH_ItemCondition=1000&rt=nc&LH_PrefLoc=1',
                  'https://www.ebay.co.uk/sch/177771/i.html?_from=R40&_nkw=motorcycle+air+filter&LH_ItemCondition=1000&rt=nc&LH_PrefLoc=1']

    rules = [Rule(LinkExtractor(allow=('.*'),
                                restrict_xpaths=(['//a[@class="s-item__link"][1]',
                                                  '//a[@class="s-item__link"][2]',
                                                  '//a[@class="s-item__link"][3]'
                                                  ])), callback='parse_items', follow=True)]

    def parse_items(self, response):

        scrapedItem = EbayItem()
        scrapedItem['startUrl'] = 'how to properly return the start_url?'
        scrapedItem['productUrl'] = 'how to properly return the 3 product urls?'
        scrapedItem['productTitle'] = response.xpath('//h1/text()').get()
        scrapedItem['productPrice'] = response.xpath('//span[@itemprop="price"]/text()').get()
        scrapedItem['timeClosed'] = 'the time the spider has stopped'

        return scrapedItem

Here's my JSON pipeline (I'm not sure how to extract the time and feed it into the JSON output):

import json

from itemadapter import ItemAdapter


class JsonWriterPipeline:

    def open_spider(self, spider):
        self.file = open('ebay_out.jl', 'w')

    def close_spider(self, spider):
        self.file.close()

    def process_item(self, productItem, spider):
        line = json.dumps(ItemAdapter(productItem).asdict()) + "\n"
        self.file.write(line)
        return productItem
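For reference, `close_spider` is the natural hook for this: it runs once the spider has stopped, so the pipeline can append a final record with the finish time before closing the file. A minimal, stdlib-only sketch (plain dicts instead of `ItemAdapter`, so it has no extra dependency; the `timeClosed` key matches the item field above):

```python
import json
from datetime import datetime, timezone


class JsonWriterPipeline:
    """JSON-lines pipeline that appends a final record with the finish time."""

    def open_spider(self, spider):
        self.file = open('ebay_out.jl', 'w')

    def close_spider(self, spider):
        # close_spider is called when the spider stops, so this timestamp
        # is, to within shutdown latency, the time the crawl finished.
        finished = datetime.now(timezone.utc).isoformat()
        self.file.write(json.dumps({'timeClosed': finished}) + "\n")
        self.file.close()

    def process_item(self, productItem, spider):
        # The original uses ItemAdapter(productItem).asdict(); a plain dict
        # is used here only to keep the sketch self-contained.
        self.file.write(json.dumps(dict(productItem)) + "\n")
        return productItem
```

With this, the last line of ebay_out.jl is a JSON object holding the finish time, and every preceding line is one scraped item.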

Upvotes: 0

Views: 608

Answers (1)

SuperUser

Reputation: 4822

Please read the documentation again, specifically the part about crawl spiders.

  1. Scrapy will follow your rules for each URL in start_urls.
  2. By the time you reach parse_items you're already on the product page, so if you want its URL it's just 'response.url'.

If you want to limit your CrawlSpider then you can set:

custom_settings = {
    'CLOSESPIDER_PAGECOUNT': 10,  # whatever number you want
    'CONCURRENT_REQUESTS': 1,
}
# You can also just keep restrict_xpaths='//a[@class="s-item__link"]' and drop the rest.

(More information in the Scrapy documentation on the CloseSpider extension.)

  1. If you want JSON, note that your pipeline writes JSON lines ('.jl'), not a regular JSON file.
  2. There's an example in the Scrapy documentation of writing items to a JSON file.
  3. In this case I think it's better to use scrapy.Spider.
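As a sketch of what that documented JSON export boils down to (assuming Scrapy >= 2.1, where the FEEDS setting is available), the custom pipeline can be replaced entirely by a feed export declared on the spider:

```python
# Sketch, assuming Scrapy >= 2.1 (the FEEDS setting). Put this on the
# spider class instead of writing a JsonWriterPipeline by hand; Scrapy
# then serializes every yielded item to the file itself.
custom_settings = {
    'FEEDS': {
        'ebay_out.jl': {'format': 'jsonlines', 'encoding': 'utf8'},
    },
}
```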

Upvotes: 1
