CSS selector of link to the next page returns empty list in Scrapy shell

I'm new to Scrapy. I'm trying to get the link to the next page from this site: https://book24.ru/knigi-bestsellery/?section_id=1592

Here is what the HTML looks like: (image not included)

In the Scrapy shell I ran this command:

response.css('li.pagination__button-item._next a::attr(href)')

It returns an empty list.

I have also tried

response.css('a.pagination__item._link._button._next.smartLink')

but it also returns an empty list.

I would be grateful for any help!

Upvotes: 1

Views: 331

Answers (2)

msenior_

Reputation: 2110

I would like to add to @SuperUser's answer. Since the site builds its HTML with JavaScript, the pagination elements you see in the browser are not in the response that Scrapy downloads. Please read the Scrapy documentation on how to handle JavaScript websites. scrapy-playwright is a recent library that I have found to be quite fast and easy to use when scraping JS-rendered sites.

Upvotes: 0

SuperUser

Reputation: 4822

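The page is generated with JavaScript, so the pagination links are not in the HTML that Scrapy downloads. Run view(response) in the shell to see what Scrapy actually received. The URL of the next page is still available from the link tag with rel="next" in the page's head: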

# with css:
In [1]: response.css('head > link:nth-child(28) ::attr(href)').get()
Out[1]: 'https://book24.ru/knigi-bestsellery/page-2/'

# with xpath:
In [2]: response.xpath('//link[@rel="next"]/@href').get()
Out[2]: 'https://book24.ru/knigi-bestsellery/page-2/'

Upvotes: 1
