kishan

Reputation: 189

How do I get the number of items scraped by Scrapy?

I am using Scrapy to extract data from a website, and the scraping itself works. Now I want the number of items scraped from a particular website. How can I get that count? Is there a built-in class for this in Scrapy? Any help would be appreciated. Thanks.

Upvotes: 1

Views: 3290

Answers (1)

shellbye

Reputation: 4828

Based on the example here, I solved the same problem like this:

1. Write a custom web service resource like this to count the items scraped:

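# Note: these import paths come from older Scrapy releases that
# shipped a built-in web service.
from scrapy.webservice import JsonResource
from scrapy import signals
from scrapy.xlib.pydispatch import dispatcher


class ItemCountResource(JsonResource):

    # Path segment under which the resource is served: /item_count
    ws_name = 'item_count'

    def __init__(self, crawler, spider_name=None):
        JsonResource.__init__(self, crawler)
        self.item_scraped_count = 0
        # Call self.scraped every time an item passes all pipelines.
        dispatcher.connect(self.scraped, signals.item_scraped)
        self._spider_name = spider_name
        self.isLeaf = spider_name is not None

    def scraped(self):
        # Signal handler: one more item was scraped.
        self.item_scraped_count += 1

    def render_GET(self, txrequest):
        # JsonResource serializes the returned value as JSON.
        return self.item_scraped_count

    def getChild(self, name, txrequest):
        # Argument order must match __init__: crawler first, then name.
        return ItemCountResource(self.crawler, name)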

2. Register the service in settings.py like this:

WEBSERVICE_RESOURCES = {
    'path.to.ItemResource.ItemCountResource': 1,
}

3. Visit http://localhost:6080/item_count while the crawl is running to get the number of items scraped so far.
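If you want to read the count from a script instead of a browser, something like the following should work. It is only a sketch. It assumes the web service is reachable at the URL from step 3 and that the resource answers with a bare JSON number, which is what JsonResource should produce for the integer returned by render_GET. The helper names parse_item_count and fetch_item_count are made up for this example.

```python
import json
from urllib.request import urlopen

# URL from step 3; the host and port depend on your web service settings.
ITEM_COUNT_URL = "http://localhost:6080/item_count"


def parse_item_count(body):
    """Decode the JSON body returned by the item_count resource into an int.

    Assumption: the body is a bare JSON number such as b"42\n".
    """
    return int(json.loads(body.decode("utf-8")))


def fetch_item_count(url=ITEM_COUNT_URL, timeout=5):
    """Ask a running crawl's web service how many items it has scraped so far."""
    with urlopen(url, timeout=timeout) as response:
        return parse_item_count(response.read())

# Usage, while the crawl is running:
#     print(fetch_item_count())
```

The parsing is kept separate from the HTTP call so it can be checked without a running crawl.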

Upvotes: 3
