Reputation: 1
I am trying to extract all the reviews from an Airbnb listing to run a sentiment analysis. The listing has 18 reviews, but the page initially shows only 6 of them (some are truncated behind a "show more" link), and all 18 appear only after clicking "show all 18 reviews".
I am automating the text extraction with Selenium, and not every page has 18 reviews. I am using XPath to locate the div that contains all the reviews, but the reviews seem to be loaded with JavaScript into this element:
<div data-plugin-in-point id="Reviews_default", data-section-id="reviews_default"....'some padding attributes here' tabindex=-1>
To find the element I am using:
br.find_element_by_xpath('/html/body/div[4]/div/div/div/div/div/div[1]/main/div/div/div[4]/div/div/div[2]/section').text
Here, br is a reference to the Selenium object.
How can I extract all the text from the reviews in this div? I am not posting the full automation code here, since once I can get this one page working I can handle extracting the reviews from all 94 pages.
Upvotes: 0
Views: 77
Reputation: 114
Try to use:
element.get_attribute("textContent")
instead of
element.text
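The reason this can help: element.text returns only the text Selenium considers visible on the rendered page, while the textContent property returns the text of the element and all its descendants whether or not they are displayed. A minimal sketch of a helper built on that idea follows. The name full_text is hypothetical, and it assumes the reviews are already present in the DOM; if the extra reviews are only fetched after clicking "show all 18 reviews", that click still has to happen first.

```python
import re


def full_text(element):
    """Return all text under a Selenium WebElement, hidden text included.

    `element` is assumed to be a WebElement (anything exposing
    get_attribute works). Runs of whitespace are collapsed because
    textContent keeps the raw whitespace of the HTML source.
    """
    raw = element.get_attribute("textContent") or ""
    return re.sub(r"\s+", " ", raw).strip()
```

With the objects from the question, this would be called as full_text(section), where section is the element returned by the br.find_element_by_xpath(...) call above.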
Upvotes: 0