Reputation: 65
I have the following function:
def get_info(url):
    options = webdriver.ChromeOptions()
    options.headless = True
    chrome_browser = webdriver.Chrome('./chromedriver', chrome_options=options)
    chrome_browser.get(url)
    name = chrome_browser.find_element_by_xpath("//h1[contains(@class,'text-center medium-text-left')]").text
    winter = WebDriverWait(chrome_browser, 10).until(
        EC.presence_of_element_located((By.XPATH, '//*[@id="main-content"]/div[1]/div[1]/div/div[2]/div[5]/div['
                                                  '2]/div/div[1]/div[3]/div')))
    chrome_browser.quit()
    return winter, name
I want to get the width percentage from the winter/spring/summer etc. charts on this site: https://www.fragrantica.com/perfume/Christian-Dior/Sauvage-Eau-de-Parfum-48100.html
The function should return the name of the fragrance and the winter row from the HTML page. The season ratings seem to load a bit more slowly than the rest of the page, so I added a wait until the row appears. When I inspect the winter ratings chart, I get this element:
<div style="border-radius: 0.2rem; height: 0.3rem; background: rgb(120, 214, 240); width: 90.3491%; opacity: 1;"></div>
First, BeautifulSoup did not find it, so I tried Selenium. Selenium does not find it either, and with WebDriverWait it just raises this error:
Traceback (most recent call last):
  File "D:/Fragrance selector/main.py", line 16, in <module>
    s = info.get_info('https://www.fragrantica.com/perfume/Christian-Dior/Sauvage-Eau-de-Parfum-48100.html')
  File "D:\Fragrance selector\fragrance_info_from_net.py", line 24, in get_info
    winter = WebDriverWait(chrome_browser, 10).until(
  File "D:\Fragrance selector\venv\lib\site-packages\selenium\webdriver\support\wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
I am genuinely out of ideas on how to get that width percentage from the ratings. I would really appreciate any help figuring this out.
Upvotes: 1
Views: 1185
Reputation: 530
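I increased the WebDriverWait timeout to 30 seconds, adjusted the XPath (div[4] instead of div[5], plus one more trailing /div to reach the inner bar), and added a get_attribute('style') call to return the style string that holds the width. See below for an example: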
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.support.wait import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument('--no-sandbox')
options.add_argument('--start-maximized')
options.add_argument('--disable-extensions')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--ignore-certificate-errors')
service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service, options=options)
driver.get('https://www.fragrantica.com/perfume/Christian-Dior/Sauvage-Eau-de-Parfum-48100.html')
name = driver.find_element(By.XPATH, "//h1[contains(@class,'text-center medium-text-left')]").text
winter = WebDriverWait(driver, 30).until(ec.presence_of_element_located((By.XPATH, '//*[@id="main-content"]/div[1]/div[1]/div/div[2]/div[4]/div[2]/div/div[1]/div[3]/div/div')))
t = winter.get_attribute('style')
driver.quit()
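The value in t is the whole style string (border radius, height, background and so on), not just the number. A minimal sketch for pulling the width out of it is below. The helper name width_percent is hypothetical and not part of Selenium, and it assumes the width is always given as a percentage, as in the element shown in the question.

```python
import re
from typing import Optional

# Match a standalone "width" declaration (not "max-width" or "border-width")
# whose value is a percentage, e.g. "width: 90.3491%".
_WIDTH_RE = re.compile(r'(?:^|;)\s*width\s*:\s*([0-9]*\.?[0-9]+)\s*%')


def width_percent(style: str) -> Optional[float]:
    """Return the width percentage from an inline style string, or None if absent.

    Hypothetical helper: assumes the width is expressed in percent.
    """
    match = _WIDTH_RE.search(style)
    return float(match.group(1)) if match else None
```

With the script above, width_percent(t) would then give the winter rating as a float, provided the page still renders the bar with an inline width.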
To get all of the ratings (Faces and Seasons) into a list, please use the following code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.support.wait import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
new = []
options = Options()
options.add_argument('--no-sandbox')
options.add_argument('--start-maximized')
options.add_argument('--disable-extensions')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--ignore-certificate-errors')
service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service, options=options)
driver.get('https://www.fragrantica.com/perfume/Christian-Dior/Sauvage-Eau-de-Parfum-48100.html')
name = driver.find_element(By.XPATH, "//h1[contains(@class,'text-center medium-text-left')]").text
WebDriverWait(driver, 30).until(ec.presence_of_element_located((By.XPATH, '//*[@id="main-content"]/div[1]/div[1]/div/div[2]/div[4]/div[2]/div/div[1]/div[3]/div/div')))
ratings = driver.find_elements(By.XPATH, './/div[@style="width: 100%; height: 0.3rem; border-radius: 0.2rem; background: rgba(204, 224, 239, 0.4);"]')
for style in ratings:
    new.append(style.find_element(By.TAG_NAME, 'div').get_attribute('style'))
print(new)
driver.quit()
This gives an ordered list (the Faces first, then the Seasons) of style strings, each containing the width percentage shown by its bar.
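Because the list is ordered, the style strings can be paired with names. The sketch below is one way to do that, not something the site or Selenium provides: label_widths is a hypothetical helper, and the labels are whatever names you pass in, in the same order as the bars appear on the page.

```python
import re


def label_widths(styles, labels):
    """Pair each label with the width percentage parsed from its style string.

    Hypothetical helper: `labels` must be supplied by the caller in page order.
    Entries without a percentage width map to None.
    """
    if len(styles) != len(labels):
        raise ValueError('styles and labels must have the same length')
    result = {}
    for label, style in zip(labels, styles):
        # Standalone "width: NN%" declaration only (skips e.g. "max-width").
        match = re.search(r'(?:^|;)\s*width\s*:\s*([0-9]*\.?[0-9]+)\s*%', style)
        result[label] = float(match.group(1)) if match else None
    return result
```

For example, label_widths(new, my_labels) returns a dict keyed by your own label names. The length check is there so that a layout change on the page fails loudly instead of silently shifting the ratings onto the wrong names.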
Upvotes: 2