314mip

Reputation: 403

Extract max page number from web page

I need to extract the maximum page number from a real estate listings website. Screenshot: page numbers. My code:

import requests
from bs4 import BeautifulSoup

URL = 'https://www.nehnutelnosti.sk/vyhladavanie/'
page = requests.get(URL)

soup = BeautifulSoup(page.content, 'html.parser')
results = soup.find(id = 'inzeraty')

sitenums = soup.find_all('ul', class_='component-pagination__items d-flex align-items-center')
sitenums.find_all('li', class_='component-pagination__item')

My code returns an error:

AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?

Thanks for any help.

Upvotes: 0

Views: 44

Answers (2)

QHarr

Reputation: 84465

A similar idea, but the filtering is done inside the CSS selector with :nth-last-child rather than by indexing the list of matches, which should be faster.

The :nth-last-child() CSS pseudo-class matches elements based on their position among a group of siblings, counting from the end.

import requests
from bs4 import BeautifulSoup as bs

r = requests.get('https://www.nehnutelnosti.sk/vyhladavanie')
soup = bs(r.text, "lxml")
print(int(soup.select_one('.component-pagination__item:nth-last-child(2) a').text.strip()))
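
The live page can change or the request can fail, so here is a minimal offline sketch of the same selector. The SAMPLE_HTML markup and the last_page_number helper are assumptions made up for illustration; the real page structure may differ. It also returns None instead of raising when no pagination link is found.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the pagination list; the real page may differ.
SAMPLE_HTML = """
<ul class="component-pagination__items d-flex align-items-center">
  <li class="component-pagination__item"><a href="?p=1">1</a></li>
  <li class="component-pagination__item"><a href="?p=2">2</a></li>
  <li class="component-pagination__item"><span>...</span></li>
  <li class="component-pagination__item"><a href="?p=2309">2309</a></li>
  <li class="component-pagination__item"><a href="?p=2">Next</a></li>
</ul>
"""


def last_page_number(html):
    """Return the number in the second-to-last pagination item, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # :nth-last-child(2) counts element siblings from the end,
    # so this picks the <li> just before the trailing "Next" item.
    link = soup.select_one(".component-pagination__item:nth-last-child(2) a")
    if link is None:
        return None
    return int(link.get_text(strip=True))


print(last_page_number(SAMPLE_HTML))
```

With the sample markup above, the selector lands on the item holding 2309, because whitespace between the li elements is not counted as a sibling.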

Upvotes: 1

baduker

Reputation: 20042

You could use a CSS selector and grab the second value from the end.

Here's how:

import requests
from bs4 import BeautifulSoup

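# Select the <a> and <span> elements inside the pagination items.
css = ".component-pagination .component-pagination__item a, .component-pagination .component-pagination__item span"
page = requests.get('https://www.nehnutelnosti.sk/vyhladavanie/')
# Take the second match from the end, which holds the highest page number.
last_page = BeautifulSoup(page.content, 'html.parser').select(css)[-2]
print(last_page.getText(strip=True))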

Output:

2309
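
As for the AttributeError in the question: find_all() returns a ResultSet (a list of tags), and a list has no find_all() method. Calling find() first gives a single Tag that can then be searched. Below is a rough offline sketch of that fix; SAMPLE_HTML is made-up markup used only for illustration, and the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the pagination list; the real page may differ.
SAMPLE_HTML = """
<ul class="component-pagination__items d-flex align-items-center">
  <li class="component-pagination__item"><a href="?p=1">1</a></li>
  <li class="component-pagination__item"><a href="?p=2">2</a></li>
  <li class="component-pagination__item"><span>...</span></li>
  <li class="component-pagination__item"><a href="?p=2309">2309</a></li>
  <li class="component-pagination__item"><a href="?p=2">Next</a></li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns one Tag (or None), so find_all() can be called on the result.
pagination = soup.find("ul", class_="component-pagination__items")
items = pagination.find_all("li", class_="component-pagination__item") if pagination else []

# Keep only the items whose text is purely numeric, then take the largest.
numbers = [int(text) for text in (li.get_text(strip=True) for li in items) if text.isdigit()]
max_page = max(numbers, default=None)
print(max_page)
```

Taking the maximum of the numeric items does not depend on where the "next" control sits, at the cost of a little more code than indexing with [-2].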

Upvotes: 1
