Reputation: 119
I'd like to extract the links to further pages from an initial website with the function response.css()
from the Scrapy library. I can't find any hints on how to use that function when the links to further pages are embedded like this:
<li style="text-align: left;"><a href="/the/desired/link">NameOfPage</a></li>
Is this possible with Scrapy, or should I use something else, like BeautifulSoup?
Upvotes: 0
Views: 619
Reputation: 1
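For anyone still searching for this answer, you can try the following (no space between li and the attribute selector, because the style attribute sits on the li itself):
response.css("li[style='text-align: left;'] a::attr(href)").get()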
Upvotes: 0
Reputation: 2264
I'm not entirely sure if it can be achieved using css, but with xpath it is quite easy to express:
response.xpath('//li[contains(@style, "text-align: left;")]')
XPath expressions are really powerful; you might give them a try before pulling in another library.
Upvotes: 1