Reputation: 311
I'm trying to scrape some Amazon questions and answers, specifically this one: https://www.amazon.com/ask/questions/Tx1AYFFVMESHMZV/ref=ask_ql_ql_al_hza
This is a section of the HTML for each answer (inspecting the page in developer tools shows more detail):
<span class="askExpanderContainer noScriptNotDisplayExpander">
<span class="askShortText">
They definitely help stretch the toes. I'm hoping to avoid a hammer toe that has been developing on one foot, and I'm not sure they're doing that, but I read that one way to avoid hammer toes developing is to stretch the toes, so I figure they will help in the long run and probably won't do any harm. From the beginning…
<a class="a-link-normal askSeeMore" href="#">
see more
</a>
</span>
<span class="askLongText">
They definitely help stretch the toes. I'm hoping to avoid a hammer toe that has been developing on one foot, and I'm not sure they're doing that, but I read that one way to avoid hammer toes developing is to stretch the toes, so I figure they will help in the long run and probably won't do any harm. From the beginning never hurt my toes, so I didn't have an adjustment period like some people have described. I use them every day, and I enjoy being calm and still for half an hour or 45 minutes while all I do is stretch my toes. That's worth something too!
<a class="a-link-normal askSeeLess" href="#">
see less
</a>
</span>
</span>
I need the whole answer, but when I try to look for the askLongText element I get the following error:
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such
element: Unable to locate element: {"method":"css selector","selector":"span.askLongText"}
(Session info: chrome=66.0.3359.181)
(Driver info: chromedriver=2.38.552518 (183d19265345f54ce39cbb94cf81ba5f15905011),platform=Mac OS X 10.13.4 x86_64)
However, I can successfully extract the askShortText element.
Here's the Python code:
driver.get(url)
title = driver.find_element_by_css_selector('p.a-size-large.askAnswersAndComments.askWrapText').text
answers_section = driver.find_element_by_css_selector('div.a-section.askAnswersAndComments.askWrapText')
answers = answers_section.find_elements_by_xpath('div[@id]')
for ans in answers:
    answer = ans.find_element_by_css_selector('span.askLongText').text
    print(answer)
Note: The last three answer elements don't contain the askLongText class. I will handle the exception later but left that out here for testing purposes. Either way, the first three elements do contain that class, so their content should be printed, but this is not happening.
Upvotes: 0
Views: 2022
Reputation: 25611
Assuming you want both the long and short answers and that you aren't trying to test the functionality of the "see more" link, a more complex locator and some Python magic let us collect both kinds of answer and trim each one of leading and trailing whitespace:
# Short answers are a classless span directly under the answer div; long ones use span.askLongText
answers = driver.find_elements_by_css_selector("div[id^='answer'] > span:not([class]), div[id^='answer'] span.askLongText")
for answer in answers:
    # Read innerText through JavaScript and split it into lines
    strings = driver.execute_script("return arguments[0].innerText", answer).splitlines()
    # Print the first non-empty line, stripped of surrounding whitespace
    print([s.strip() for s in strings if s.strip()][0])
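The list indexing above raises an IndexError if an element's innerText happens to be empty or whitespace only. As a minimal sketch, the same "first non-empty line" step can be pulled into a small helper that falls back to an empty string. The name first_nonempty_line is hypothetical, and the sample string only assumes that the link label lands on its own line after the answer body.

```python
def first_nonempty_line(inner_text):
    """Return the first non-blank line of inner_text, stripped, or '' if there is none."""
    stripped = (line.strip() for line in inner_text.splitlines())
    # next() with a default avoids the IndexError that [...][0] raises on empty input
    return next((line for line in stripped if line), "")


if __name__ == "__main__":
    # Assumed shape of the innerText: answer body first, link label on a later line
    sample = "\n   Answer body goes here.   \n see less \n"
    print(first_nonempty_line(sample))
```

In the loop above, this would replace the list comprehension: print(first_nonempty_line(driver.execute_script("return arguments[0].innerText", answer))).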
Upvotes: 0
Reputation: 819
This worked for me.
# Expand every truncated answer first
see_more_links = driver.find_elements_by_partial_link_text("see more")
for link in see_more_links:
    link.click()
# The long text can be read once it has been expanded
answers = driver.find_elements_by_css_selector("span.askLongText")
for answer in answers:
    print(answer.text.replace('see less', ''))
OUTPUT:
They definitely help stretch the toes. I'm hoping to avoid a hammer toe that has been developing on one foot, and I'm not sure they're doing that, but I read that one way to avoid hammer toes developing is to stretch the toes, so I figure they will help in the long run and probably won't do any harm. From the beginning never hurt my toes, so I didn't have an adjustment period like some people have described. I use them every day, and I enjoy being calm and still for half an hour or 45 minutes while all I do is stretch my toes. That's worth something too!
They absolutely realign your toes and straighten out your foot bones. You need to use them regularly in the beginning to set them straighter. After your pain abates a little, you can lower the frequency and extend the time you wear them . They've helped my bunions become far less painful over the last several weeks. Well worth the $ IMO, and definitely better than surgery !!
I don't know how long one would have to wear them to permanently improve alignment. Yoga toes are very helpful in releasing foot and toe tension and this does improve walking and balance for me. Because I already have a bunion on one foot I doubt these would help keep alignment in that foot but yoga toes help my feet to adapt better to that bunion.
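One caveat with replace('see less', ''): it removes every occurrence of that phrase, including any that appear inside the answer itself. A minimal sketch of a narrower cleanup follows, assuming the link label always sits at the very end of the element text. The helper name strip_see_less is hypothetical.

```python
def strip_see_less(text, marker="see less"):
    """Remove the marker only when it is the trailing text, leaving the body untouched."""
    cleaned = text.strip()
    if cleaned.endswith(marker):
        # Cut the trailing marker and any whitespace left in front of it
        cleaned = cleaned[: -len(marker)].rstrip()
    return cleaned


if __name__ == "__main__":
    print(strip_see_less("I see less pain now.\nsee less"))
```

With this helper the last line of the loop would become print(strip_see_less(answer.text)).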
Upvotes: 2
Reputation: 193128
@GPT14's answer was nearly perfect but has a small bug: the solution also prints the text "see less" along with each answer, which is not part of the answer. To extract only the answer text, you can use the following code block:
Code Block:
from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument('disable-infobars')
driver=webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get("https://www.amazon.com/ask/questions/Tx1AYFFVMESHMZV/ref=ask_ql_ql_al_hza")
see_more_links = driver.find_elements_by_css_selector("span.askShortText>a")
for link in see_more_links:
    link.click()
long_answer_texts = driver.find_elements_by_xpath("//span[@class='askLongText']")
for long_answer_text in long_answer_texts:
    print(driver.execute_script('return arguments[0].firstChild.textContent;', long_answer_text).strip())
Console Output:
They definitely help stretch the toes. I'm hoping to avoid a hammer toe that has been developing on one foot, and I'm not sure they're doing that, but I read that one way to avoid hammer toes developing is to stretch the toes, so I figure they will help in the long run and probably won't do any harm. From the beginning never hurt my toes, so I didn't have an adjustment period like some people have described. I use them every day, and I enjoy being calm and still for half an hour or 45 minutes while all I do is stretch my toes. That's worth something too!
They absolutely realign your toes and straighten out your foot bones. You need to use them regularly in the beginning to set them straighter. After your pain abates a little, you can lower the frequency and extend the time you wear them . They've helped my bunions become far less painful over the last several weeks. Well worth the $ IMO, and definitely better than surgery !!
I don't know how long one would have to wear them to permanently improve alignment. Yoga toes are very helpful in releasing foot and toe tension and this does improve walking and balance for me. Because I already have a bunion on one foot I doubt these would help keep alignment in that foot but yoga toes help my feet to adapt better to that bunion.
Upvotes: 1
Reputation: 732
I have debugged your code. This line:
answer = ans.find_element_by_css_selector('span.askLongText').text
is throwing an exception because the span element doesn't contain only text. There is an anchor tag within the span tag. To fetch the full content of the span element, you have to use .get_attribute('innerHTML').
You will have to change the line of code to:
answer = ans.find_element_by_css_selector('span.askLongText').get_attribute('innerHTML')
The answer variable will then hold the full content of the span element.
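Note that innerHTML is markup, so the returned string still contains the anchor tag and its "see less" label. A rough sketch of turning that string back into plain answer text with the standard library follows, assuming the only links inside the span are the expander links. TextWithoutLinks and inner_html_to_text are hypothetical names, not part of Selenium.

```python
from html.parser import HTMLParser


class TextWithoutLinks(HTMLParser):
    """Collect text nodes, skipping anything nested inside an <a> element."""

    def __init__(self):
        super().__init__()
        self.link_depth = 0  # greater than 0 while inside an <a>...</a>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.link_depth:
            self.link_depth -= 1

    def handle_data(self, data):
        if self.link_depth == 0:
            self.chunks.append(data)


def inner_html_to_text(inner_html):
    """Return the text of an innerHTML string with link labels dropped and whitespace collapsed."""
    parser = TextWithoutLinks()
    parser.feed(inner_html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


if __name__ == "__main__":
    # Shaped like the askLongText markup quoted in the question
    sample = 'That\'s worth something too!\n<a class="a-link-normal askSeeLess" href="#">\nsee less\n</a>\n'
    print(inner_html_to_text(sample))
```

It would be applied to the value returned by get_attribute('innerHTML') before printing.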
Upvotes: 0
Reputation: 22440
If you wish to get all the answers in a single shot, the script below is worth trying:
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://www.amazon.com/ask/questions/Tx1AYFFVMESHMZV/ref=ask_ql_ql_al_hza")
for showmore in driver.find_elements_by_css_selector(".askSeeMore"):
    showmore.click()
for ans in driver.find_elements_by_css_selector("[id^='answer-']"):
    if "askLongText" in ans.get_attribute("class"):
        print(ans.find_element_by_css_selector(".askLongText").text)
    else:
        print(ans.find_element_by_css_selector("span").text)
driver.quit()
Upvotes: 0