Reputation: 189
I am using the Python Scrapy tool to extract data from a website, and the scraping itself works. Now I want a count of the items scraped from a particular website. How can I get the number of items scraped? Is there a built-in class for that in Scrapy? Any help will be appreciated. Thanks.
Upvotes: 1
Views: 3290
Reputation: 4828
Based on the example here, I solved the same problem like this:
1. Write a custom web service like this to count the items scraped:
    from scrapy import signals
    from scrapy.webservice import JsonResource
    from scrapy.xlib.pydispatch import dispatcher


    class ItemCountResource(JsonResource):
        """Web service resource that reports how many items were scraped."""

        ws_name = 'item_count'

        def __init__(self, crawler, spider_name=None):
            JsonResource.__init__(self, crawler)
            self.item_scraped_count = 0
            # Increment the counter every time the item_scraped signal fires.
            dispatcher.connect(self.scraped, signals.item_scraped)
            self._spider_name = spider_name
            self.isLeaf = spider_name is not None

        def scraped(self):
            self.item_scraped_count += 1

        def render_GET(self, txrequest):
            return self.item_scraped_count

        def getChild(self, name, txrequest):
            # Child resource for a specific spider name.
            return ItemCountResource(self.crawler, name)
2. Register the service in settings.py like this:
    WEBSERVICE_RESOURCES = {
        'path.to.ItemResource.ItemCountResource': 1,
    }
3. Visit http://localhost:6080/item_count to get the number of items scraped so far.
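If you would rather read the count from a script than from a browser, something like the following should work. It is only a minimal sketch: it assumes the crawl is still running, that the web service listens on localhost:6080 as in step 3, and that the resource replies with the bare counter encoded as JSON. The helper names `parse_item_count` and `fetch_item_count` are made up for this example and are not part of Scrapy.

```python
import json
from urllib.request import urlopen

# Assumption: the web service from the steps above is running locally on
# port 6080 and exposes the counter at /item_count.
ITEM_COUNT_URL = "http://localhost:6080/item_count"


def parse_item_count(body: bytes) -> int:
    """Decode a JSON response body (assumed to be a bare number) into an int."""
    return int(json.loads(body.decode("utf-8")))


def fetch_item_count(url: str = ITEM_COUNT_URL, timeout: float = 5.0) -> int:
    """Ask the running crawler for its current item count (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_item_count(response.read())
```

Calling `fetch_item_count()` while the spider is running returns the current value of the counter; once the crawl finishes the web service shuts down and the request fails with a connection error, so poll it during the crawl.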
Upvotes: 3