.findAll() finds nothing from webpage

Question

To pull reviews from the Google Play store, I am trying to learn the Beautiful Soup library. I wrote some code that should get me all reviews (including the star rating, date, and name of the reviewer), but the output is just an empty list. The problem is probably something very basic that I am just too inexperienced to know about.

from urllib.request import urlopen
from bs4 import BeautifulSoup as soup
my_url = 'https://play.google.com/store/apps/details?id=com.playstudios.popslots&showAllReviews=true'
uclient = urlopen(my_url)
page_html = uclient.read()
uclient.close()
page_soup = soup(page_html, "html.parser")
reviews = page_soup.findAll("div",{"class":'d15Mdf bAhLNe'})
len(reviews)

The output is 0.

What should I do in order to fix this?

9000 · Accepted Answer

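Because the class you're looking for is not in the HTML that the server sends back. You can check this without a browser; the following prints nothing: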

curl 'https://play.google.com/store/apps/details?id=com.playstudios.popslots&showAllReviews=true' | grep 'd15Mdf bAhLNe'

Almost the entire page is produced by JavaScript running in the browser, including, I suppose, all the interesting bits you're looking for.
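
If you would rather run that check from Python, here is a minimal sketch using only the standard library. The names `ClassCounter`, `count_in_static_html`, `fetch_static_html` and the `SAMPLE` page are made up for illustration; `SAMPLE` is a tiny stand-in for a JavaScript-rendered page, not the real Play Store markup.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class ClassCounter(HTMLParser):
    """Count start tags of one kind that carry all of the wanted classes."""

    def __init__(self, tag, classes):
        super().__init__()
        self.tag = tag
        self.wanted = set(classes.split())
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        # A missing or valueless class attribute is treated as "no classes".
        present = set((dict(attrs).get("class") or "").split())
        if self.wanted <= present:
            self.count += 1


def count_in_static_html(html, tag, classes):
    """Return how many matching elements exist before any JavaScript runs."""
    counter = ClassCounter(tag, classes)
    counter.feed(html)
    return counter.count


def fetch_static_html(url):
    """Download the raw HTML, which is all that urlopen (or curl) ever sees."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical page: the review markup only exists as a string inside a
# script, so it is not an element until a browser executes that script.
SAMPLE = """<html><body><div id="root"></div>
<script>
  document.getElementById("root").innerHTML = '<div class="d15Mdf bAhLNe">Great app</div>';
</script></body></html>"""

print(count_in_static_html(SAMPLE, "div", "d15Mdf bAhLNe"))  # prints 0
```

To try it on the real page, pass `fetch_static_html(my_url)` instead of `SAMPLE`. A count of 0 there would point to the same cause: the elements are created in the browser, not sent by the server.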

If you want to try and scrape such a page, look for scrapers that actually run JavaScript (usually in Chrome running in headless mode).
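
One way to do that is Selenium driving headless Chrome. The sketch below is an assumption on my part, not something tested against the Play Store: it needs the `selenium` and `bs4` packages plus a local Chrome install, the helper names (`to_css_selector`, `fetch_rendered_html`, `find_reviews`) are hypothetical, and the class names are the ones from the question, which Google may change at any time.

```python
def to_css_selector(tag, classes):
    """Turn a tag and a space-separated class string into a CSS selector."""
    return tag + "".join("." + name for name in classes.split())


def fetch_rendered_html(url, selector, timeout=15):
    """Load url in headless Chrome, wait for selector, return the rendered HTML."""
    # Imported here so the helper above still works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    # Assumes a recent Chrome; older versions may need plain "--headless".
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until JavaScript has inserted at least one matching element;
        # raises TimeoutException if none appears within `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, selector))
        )
        return driver.page_source
    finally:
        driver.quit()


def find_reviews(url, tag="div", classes="d15Mdf bAhLNe"):
    """Parse the rendered page with Beautiful Soup, as in the question."""
    from bs4 import BeautifulSoup

    selector = to_css_selector(tag, classes)
    html = fetch_rendered_html(url, selector)
    return BeautifulSoup(html, "html.parser").select(selector)
```

Calling `len(find_reviews(my_url))` would then count the review blocks present after the scripts have run. If the wait times out, the class names have probably changed and need to be looked up again in the browser's developer tools.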
