Fazle

Reputation: 1

Retrieve hidden text from remote page using Selenium

I am trying to extract all the reviews from Airbnb to run a sentiment analysis. The listing I am working with has 18 reviews, but the page initially shows only 6 of them (some are truncated behind a "show more" link), and all 18 appear only after clicking "show all 18 reviews".

I am automating the text extraction with Selenium, and not every page has 18 reviews. I use XPath to locate the div that contains all the reviews, but the reviews appear to be loaded with JavaScript into this element:

<div data-plugin-in-point id="Reviews_default" data-section-id="reviews_default" ... (some padding attributes here) tabindex="-1">

To find the element I am using:

br.find_element_by_xpath('/html/body/div[4]/div/div/div/div/div/div[1]/main/div/div/div[4]/div/div/div[2]/section').text

Here, br is the Selenium WebDriver instance.

How can I extract all the review text from this div? I have left out the rest of the automation code, since once this one page works I can extend it to extract the reviews from all 94 pages.

Upvotes: 0

Views: 77

Answers (1)

Rubén Paredes

Reputation: 114

Try using:

element.get_attribute("textContent")

instead of

element.text
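
The reason this can help: element.text returns only the text Selenium considers visible on the rendered page, while the textContent property returns the text of every node under the element, including nodes hidden with CSS. A minimal sketch of wrapping that call is below. The helper name full_text is hypothetical, and it assumes the argument is a Selenium WebElement (any object with a get_attribute method behaves the same way).

```python
import re


def full_text(element):
    """Return an element's text, including text hidden by CSS.

    `element` is assumed to be a Selenium WebElement; the only method
    used is get_attribute, which reads the DOM's textContent property.
    """
    # get_attribute can return None, so fall back to an empty string
    raw = element.get_attribute("textContent") or ""
    # textContent keeps the markup's original whitespace and newlines,
    # so collapse each run of whitespace into a single space
    return re.sub(r"\s+", " ", raw).strip()
```

With the question's setup, this would be called as full_text(section), where section is the element returned by the find_element_by_xpath call above. Note that textContent can only return text that is already in the DOM. If the remaining reviews are fetched only after the "show all 18 reviews" link is clicked, that click would probably still have to happen before reading the text.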

Upvotes: 0
