Reputation: 19
I have the script below to scrape data from iHerb.
I put driver.close() at the end so that it would stop after the 24th item, but it keeps scraping and won't stop.
Is there a way to stop the loop and close the browser once the 24th item is finished?
Thank you so much!
Please check the script below:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome(chrome_path)
driver.get("https://ca.iherb.com/c/Vitamins?noi=24")
wait = WebDriverWait(driver, 10)
#close the pop up
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR,"svg[data-ga-event-action='list-close']"))).click()
#store all the links in a list
item_links = [item.get_attribute("href") for item in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR,".absolute-link-wrapper > a.product-link")))]
review_titles= list()
review_contents = list()
product_helpful= list()
product_not_helpful = list()
member_rating = list()
total_rate = list()
#iterate over the links
for item_link in item_links:
    driver.get(item_link)
    #locate and click on the `View All Reviews` link
    all_reviews_link = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR,"span.all-reviews-link > a")))
    x = all_reviews_link.get_attribute("href")
    MAX_PAGE_NUM = 2
    for i in range(1, MAX_PAGE_NUM + 1):
        page_num = str(i)
        url = x +'?&p='+ page_num
        print(url)
        driver.get(url)
        review_containers = driver.find_elements_by_class_name('review-row')
        for containers in review_containers:
            total_rate.append(driver.find_element_by_class_name('css-i36p8g').text)
            review_contents.append(containers.find_element_by_class_name('review-text').text)
            product_helpful.append(containers.find_element_by_css_selector('[title="Helpful"] span').text)
            product_not_helpful.append(containers.find_element_by_css_selector('[title="Unhelpful"] span').text)
            stars = containers.find_elements_by_class_name("css-172co2l")
            rating = 0
            for star in stars:
                star_color = star.find_element_by_tag_name("path").get_attribute("fill")
                #print(star_color)
                if star_color != "transparent":
                    rating += 1
            member_rating.append(rating)
        time.sleep(5) #slow the script down
driver.close()
Upvotes: 0
Views: 1794
Reputation: 334
You could try driver.quit(). This closes every browser window that was opened by the Selenium session and ends the session itself, whereas driver.close() closes only the current window. Both are valid, but if close() does not shut the browser down, try quit().
For more details, you can view this link
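To make the difference concrete, here is a minimal sketch of how the loop could be capped at 24 items and the browser shut down no matter what happens inside it. The names scrape_links, scrape_one, and max_items are hypothetical and not part of the original script; the driver argument is assumed to be any object with get() and quit(), such as a Selenium WebDriver.

```python
def scrape_links(driver, item_links, scrape_one, max_items=24):
    """Visit at most max_items links, then always shut the browser down.

    driver     -- any object with get() and quit(), e.g. a Selenium WebDriver
    scrape_one -- hypothetical callback that extracts data from the loaded page
    """
    results = []
    try:
        # Slicing caps the loop even if the page returned more links.
        for item_link in item_links[:max_items]:
            driver.get(item_link)
            results.append(scrape_one(driver))
    finally:
        # quit() ends the whole WebDriver session (all windows);
        # close() would only close the current window.
        driver.quit()
    return results
```

In the question's script, the body of the outer for loop would go into the scrape_one callback. Because quit() sits in a finally block, it also runs when one of the pages raises an exception halfway through.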
Upvotes: 2
Reputation: 31
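Import os:
import os
Then delete "browser.close()" (driver.close() in your script) and add the line below. Note that taskkill is a Windows command, so this only works on Windows:
os.system("taskkill /im chromedriver.exe")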
Upvotes: 0