Reputation: 47
I am trying to reach getMonthEvents, but the callback never seems to be executed. Any ideas? Thanks :)
from scrapy.selector import Selector
from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor
from scrapy.http import Request
from scrapy.item import Item, Field


class EventItems(Item):
    Title = Field()
    Link = Field()
    Date = Field()
    Time = Field()
    Place = Field()
    Description = Field()
    Program = Field()


class SpiderForHSMT(CrawlSpider):
    name = 'HMTM'
    start_urls = ['http://www.some_website.com']
    rules = (
        Rule(
            LinkExtractor(
                restrict_xpaths=('//div[@id="VER_2013_DISPLAYSEARCHRESULTS"]/table[1]/tr[3]'),
                tags=('a',),
                attrs=('href',),
            ),
            callback='parseMonth',
        ),
    )

    def parseMonth(self, response):
        request = Request(response.url, callback=self.getMonthEvents)
        yield request

    def getMonthEvents(self, response):
        print(response.url)
Upvotes: 1
Views: 2339
Reputation: 2594
Because parseMonth issues a second request for the same URL that was just crawled, Scrapy's duplicate filter drops it (see the documentation), so getMonthEvents is never called. Add dont_filter=True
to your request so that it is not filtered.
request = Request(response.url, dont_filter=True, callback = self.getMonthEvents)
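To see why the repeated request disappears, here is a rough, standard-library-only sketch of how a duplicate filter behaves. It is a simplified model, not Scrapy's actual code: ToyRequest, ToyScheduler and the URL are hypothetical names made up for illustration, and the real filter compares request fingerprints rather than raw URLs.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToyRequest:
    """Hypothetical stand-in for a request object (not scrapy.http.Request)."""
    url: str
    callback: Optional[Callable] = None
    dont_filter: bool = False


class ToyScheduler:
    """Hypothetical scheduler that remembers which URLs it has already queued."""

    def __init__(self):
        self.seen = set()
        self.queue = []

    def enqueue(self, request):
        # Only requests without dont_filter go through the duplicate check.
        # The callback plays no part in deciding what counts as a duplicate.
        if not request.dont_filter:
            if request.url in self.seen:
                return False  # dropped, so its callback will never run
            self.seen.add(request.url)
        self.queue.append(request)
        return True


scheduler = ToyScheduler()
url = "http://www.some_website.com/month"  # made-up URL for the example

first = scheduler.enqueue(ToyRequest(url))                      # link followed by the rule
repeat = scheduler.enqueue(ToyRequest(url))                     # same URL re-requested in parseMonth
forced = scheduler.enqueue(ToyRequest(url, dont_filter=True))   # bypasses the check
print(first, repeat, forced)
```

Note that parseMonth already receives the downloaded response for that URL, so if a second download is not needed, pointing the rule at getMonthEvents directly (callback='getMonthEvents') should avoid the duplicate request altogether.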
Upvotes: 2