Reputation: 779
I'm trying to get the original score of each review from this page: https://www.rottentomatoes.com/m/avengers_endgame/reviews
I am getting something returned, but it's not what I'm expecting. I'm getting a list of something like this:
[<selenium.webdriver.remote.webelement.WebElement (session="e31633b4d123fff6432c6726cc26ad65", element="a9b96431-d527-4b79-8ddd-01d18f362cbd")>
(I just gave an example of one, but there is a list of these)
Here's what I'm using to get this result:
rows = driver.find_elements_by_xpath("//div[@class='row review_table_row']")
scores = []
for row in rows:
    scores.append(row.find_elements_by_xpath('//div[contains(concat(" ",normalize-space(@class)," ")," review_desc ")]//div[contains(concat(" ",normalize-space(@class)," ")," small ")]'))
Does anyone know what's wrong with my XPath?
Upvotes: 1
Views: 289
Reputation: 18960
We could use a slightly more specific XPath that only matches rows which actually contain a score:
driver.get("https://www.rottentomatoes.com/m/avengers_endgame/reviews")
rows = driver.find_elements_by_xpath("//div[@class='row review_table_row' and contains(., 'Original Score:')]")
Then the extraction becomes easy, because each row's text can be split on the label:
scores = []
for row in rows:
    scores.append(row.text.split('Original Score: ')[1])
Out[20]: ['2.5/4.0', '4/5', '9/10', '3/5', '2.5/5', '5 / 5', '3.5/4', '4/4', '4/5', '4/5', 'B+', '3.5/4', 'B+', '7/10', '9/10']
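If you ever drop the contains(., 'Original Score:') filter, the split(...)[1] above raises an IndexError for rows without a score. Here is a minimal sketch of a more defensive helper that works on the plain row text. The function name extract_original_score and the sample strings are made up for illustration and are not taken from the page:

```python
from typing import Optional

MARKER = "Original Score:"

def extract_original_score(row_text: str) -> Optional[str]:
    """Return the text that follows 'Original Score:' or None if it is absent."""
    _, found, tail = row_text.partition(MARKER)
    if not found:
        # The row has no score label at all.
        return None
    tail = tail.strip()
    if not tail:
        # The label is present but nothing follows it.
        return None
    # Keep only the first line after the label.
    return tail.splitlines()[0].strip()

# Hypothetical row texts, standing in for row.text values from Selenium.
sample_rows = [
    "Full Review | Original Score: 2.5/4.0",
    "Full Review",
    "Full Review | Original Score: B+",
]
scores = [s for s in map(extract_original_score, sample_rows) if s is not None]
print(scores)
```

With Selenium you would pass row.text to the helper inside the loop instead of the sample strings.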
Upvotes: 1
Reputation: 138
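You can try this (note that it uses a Scrapy-style response selector rather than Selenium, and .get() only returns the first match):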
response.xpath('//div[@class="small subtle review-link"]').get().split('Original Score: ')[1].split('\n')[0]
The result would be:
'2.5/4.0'
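To collect every score instead of just the first one, one option is to take all matching fragments (in Scrapy, .getall() returns them as a list of HTML strings) and pull the score out of each with a regular expression. This is only a sketch: the helper name scores_from_fragments and the sample HTML are assumptions for illustration, and the real markup on the page may differ:

```python
import re
from typing import List

# Capture everything after the label up to the next newline or tag.
SCORE_RE = re.compile(r"Original Score:\s*([^\n<]+)")

def scores_from_fragments(fragments: List[str]) -> List[str]:
    """Extract the score from each HTML fragment that contains one."""
    scores = []
    for html in fragments:
        match = SCORE_RE.search(html)
        if match:
            scores.append(match.group(1).strip())
    return scores

# Hypothetical fragments, standing in for the list returned by .getall().
fragments = [
    '<div class="small subtle review-link">\n<a href="#">Full Review</a>\n| Original Score: 2.5/4.0\n</div>',
    '<div class="small subtle review-link"><a href="#">Full Review</a></div>',
    '<div class="small subtle review-link"><a href="#">Full Review</a> | Original Score: B+</div>',
]
print(scores_from_fragments(fragments))
```

Fragments without the label are skipped rather than raising an IndexError.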
Upvotes: 1