mn2609

Reputation: 53

Web scraping with Python: BeautifulSoup findAll() doesn't find all

I am new to Python and am currently building a web scraper to learn the language. I want to save all listings from https://www.notebooksbilliger.de/studentenprogramm/notebooks, which shows all notebooks in the site's student offers category.

from urllib.request import urlopen
from bs4 import BeautifulSoup as soup

my_url = 'https://www.notebooksbilliger.de/studentenprogramm/notebooks'

uClient = urlopen(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")
containers = page_soup.findAll("div", {"class":"mouseover clearfix"})

I am trying things out in the console as well, but when I check the len of containers, this is the output I get:

>>> len(containers)
1

That can't be right, since the page is set to show 50 listings. I have tried searching with different parameters, but I always find just one item, and then the search seems to stop.

I am a little lost right now and can't quite figure out how to fix this problem. Any help?

Greetings :)

Upvotes: 0

Views: 136

Answers (1)

mn2609

Reputation: 53

Well, this is embarrassing.

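Just after I posted this (in my defense, after multiple searches and endless trial and error), I realized that an HTML class name can't contain spaces, so mouseover clearfix is actually two classes. When both are passed as one string, BeautifulSoup compares it against the exact value of the class attribute, so a div with additional classes or a different class order is not matched. Searching for a single class works: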

containers = page_soup.findAll("div", {"class":"mouseover"})
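
If you want to require both classes rather than just one, a CSS selector is one option. Here is a minimal sketch that runs without network access; the SAMPLE_HTML markup is made up for illustration and is not the real page, so the actual site may be structured differently:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real listing page (an assumption).
SAMPLE_HTML = """
<div class="mouseover clearfix">A</div>
<div class="mouseover clearfix highlight">B</div>
<div class="clearfix mouseover">C</div>
<div class="clearfix">D</div>
"""

page_soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Original attempt: the string is compared against the whole class attribute,
# so only a div whose class attribute is exactly "mouseover clearfix" matches.
exact = page_soup.find_all("div", {"class": "mouseover clearfix"})

# Fix from the answer: match every div that has the class "mouseover".
single = page_soup.find_all("div", class_="mouseover")

# Alternative: a CSS selector requiring both classes, in any order,
# regardless of any extra classes on the tag.
both = page_soup.select("div.mouseover.clearfix")

print(len(exact), len(single), len(both))
```

On this sample, the single-class search and the CSS selector both pick up the divs with extra classes or a different class order, which the exact-string search misses.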

Upvotes: 3
