Reputation: 1
I am trying to use XPath to scrape Reddit posts from a forum. One thing I want the spider to do is automatically go to the next page as soon as it finishes scraping the current page. The page's HTML looks like this:
<span class="next-button"><a href="https://www.reddit.com/r/InteriorDesign/?count=975&after=t3_8ol7yp" rel="nofollow next" >next ›</a></span>
and I used the XPath selector response.xpath("//a[@class = 'next-button']"), but it returned nothing. Can someone help me figure out why?
thanks! Hao
Upvotes: 0
Views: 88
Reputation: 29022
The @class attribute is on the span element and not on the a link element. So change your XPath to
response.xpath("//span[@class = 'next-button']/a")
to select the a element, or
response.xpath("//span[@class = 'next-button']/a/@href")
to get the link address.
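To see the difference without running a spider, here is a minimal sketch that uses Python's standard-library xml.etree.ElementTree as a stand-in for Scrapy's selector (an assumption made only so the snippet runs on its own). ElementTree supports just a subset of XPath, so the paths start with ".//" and the href is read with .get() instead of /@href. The wrapping div, the SAMPLE constant, and the next_page_url helper are hypothetical names for illustration, and the "&" in the URL is escaped as "&amp;" because this parser expects well-formed XML.

```python
import xml.etree.ElementTree as ET

# Markup from the question, wrapped in a <div> for illustration.
# "&" is escaped as "&amp;" so the strict XML parser accepts it.
SAMPLE = (
    '<div><span class="next-button">'
    '<a href="https://www.reddit.com/r/InteriorDesign/'
    '?count=975&amp;after=t3_8ol7yp" rel="nofollow next">next ›</a>'
    '</span></div>'
)


def next_page_url(markup):
    """Return the href of the next-page link, or None if there is none."""
    root = ET.fromstring(markup)
    # The class is on the <span>, so match the span and step down to <a>.
    link = root.find(".//span[@class='next-button']/a")
    return link.get("href") if link is not None else None


root = ET.fromstring(SAMPLE)

# Original attempt: looks for an <a> that itself carries the class.
wrong = root.findall(".//a[@class='next-button']")

# Corrected path: <span class="next-button"> containing the <a>.
right = root.findall(".//span[@class='next-button']/a")

print(len(wrong), len(right))
print(next_page_url(SAMPLE))
```

In a Scrapy callback, the equivalent would presumably be to take the first result of the /@href XPath above and, if it is not empty, yield response.follow(url, callback=self.parse) so the spider moves on to the next page.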
Upvotes: 1