Reputation: 1
I am trying to crawl quotes from the website http://www.quotationspage.com/subjects/character/. My spider code looks like this:
import scrapy


class quote(scrapy.Spider):
    name = 'quotes'  # spider name
    start_urls = ['http://www.quotationspage.com/subjects/character/']  # target URLs

    def parse(self, response):
        total_count = len(response.xpath('//dl/dt').getall())  # number of quotes on the page
        for i in range(1, total_count + 1):  # XPath positions are 1-based
            xp_quote = f'//dl/dt[{i}]/a/text()'
            xp_writer = f'//dl/dd[{i}]/b/a/text()'
            page_quote_writer = response.xpath(xp_writer).get()
            page_quote = response.xpath(xp_quote).get()
            yield {  # one item per quote
                'Writer': (page_quote_writer if page_quote_writer is not None else 'Unable to fetch'),
                'Quote': (page_quote if page_quote is not None else 'Unable to fetch'),
            }
        # this is the part that returns nothing
        next_page = response.css('#content tbody td a::attr(href)').getall()
        print(next_page)
The problem is that next_page always comes back empty. I have tried an equivalent XPath as well, with the same result. I know the issue is in response.css('#content tbody td a::attr(href)').getall(): the CSS selector matches nothing. I copied the XPath, and later the CSS selector, from Chrome > Inspect element, but still no luck. The weird thing is that the same XPath or CSS selector works fine in Chrome > Inspect element > Find.
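One thing that might explain the difference: browsers add a tbody element to tables when they build the DOM, even if the HTML sent by the server has none, so a selector copied from Inspect element can contain a tbody step that does not exist in the response Scrapy receives. I have not confirmed that this page's raw HTML lacks tbody, so treat the snippet below as a sketch under that assumption. It uses only the standard library, and RAW_HTML, its href value and the hrefs helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the raw HTML sent by the server: note there is
# no <tbody> tag. The href value is invented for this example.
RAW_HTML = """
<div id="content">
  <table>
    <tr>
      <td><a href="/subjects/character/2.html">Next</a></td>
    </tr>
  </table>
</div>
""".strip()


def hrefs(markup, path):
    """Return the href of every element matched by an ElementTree path."""
    root = ET.fromstring(markup)  # root is the <div id="content"> element
    return [a.get("href") for a in root.findall(path)]


# Path as a browser would report it, including the tbody it inserted itself.
with_tbody = hrefs(RAW_HTML, "./table/tbody/tr/td/a")
# Path that follows the markup as it was actually sent.
without_tbody = hrefs(RAW_HTML, "./table/tr/td/a")

print(with_tbody)
print(without_tbody)
```

If the same thing is happening here, dropping the tbody step in the spider, for example response.css('#content td a::attr(href)').getall(), may be enough, assuming the pagination links really sit in a table inside #content. Running scrapy shell against the URL and calling view(response) shows the page as Scrapy sees it, which is a quick way to check.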
Any help is appreciated.
Upvotes: 0
Views: 226