Trying to get score from Rotten Tomatoes URL

Question

I'm trying to get the original scores from this page: https://www.rottentomatoes.com/m/avengers_endgame/reviews

I am getting something returned, but it's not what I'm expecting. I'm getting a list of something like this:

(I just gave an example of one, but there is a list of these)

Here's what I'm using to get this result:

rows = driver.find_elements_by_xpath("//div[@class='row review_table_row']")
scores =[]
for row in rows:
    scores.append(row.find_elements_by_xpath('//div[contains(concat(" ",normalize-space(@class)," ")," review_desc ")]//div[contains(concat(" ",normalize-space(@class)," ")," small ")]'))

Does anyone know what's wrong with my XPath?

wp78de · Accepted Answer

We could use a more specific XPath that only matches rows that actually contain a score:

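from selenium.webdriver.common.by import By  # find_elements_by_xpath was removed in Selenium 4.3
driver.get("https://www.rottentomatoes.com/m/avengers_endgame/reviews")
# Match only review rows whose text includes the "Original Score:" label
rows = driver.find_elements(By.XPATH, "//div[@class='row review_table_row' and contains(., 'Original Score:')]")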

Then the extraction becomes easy:

scores = []
for row in rows:
    # Keep only the text that follows the "Original Score: " label
    scores.append(row.text.split('Original Score: ')[1])

Out[20]: ['2.5/4.0', '4/5', '9/10', '3/5', '2.5/5', '5 / 5', '3.5/4', '4/4', '4/5', '4/5', 'B+', '3.5/4', 'B+', '7/10', '9/10']
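
If you want the split step to tolerate rows without the label, or to tidy entries like '5 / 5', something along these lines might help. This is only a sketch: extract_score and the sample strings are hypothetical stand-ins for the row.text values, and collapsing the spaces around the slash is an optional choice, not something the page requires.

```python
import re

LABEL = "Original Score:"

def extract_score(row_text):
    """Return the score that follows LABEL in row_text, or None if it is absent."""
    # partition never raises, unlike split(...)[1] on text without the label
    _, found, tail = row_text.partition(LABEL)
    tail = tail.strip()
    if not found or not tail:
        return None
    # Keep only the first line after the label, then drop spaces around "/"
    first_line = tail.splitlines()[0]
    return re.sub(r"\s*/\s*", "/", first_line).strip()

# Hypothetical row texts standing in for row.text values from Selenium
sample_rows = [
    "Full Review | Original Score: 4/5",
    "Full Review | Original Score: 5 / 5",
    "Full Review",
    "Full Review | Original Score: B+",
]
scores = [s for s in map(extract_score, sample_rows) if s is not None]
print(scores)
```

With real rows you would call extract_score(row.text) inside the loop. If you would rather keep the per-row lookup from the question, note that an XPath beginning with // is evaluated against the whole document even when it is called on a row element; prefixing it with a dot (.//) scopes the search to that row.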
