tklein

Reputation: 77

Failing to scrape web data with Selenium

I'm trying to fetch data from the front page table on https://icostats.com/. But something just isn't clicking.

from selenium import webdriver

browser = webdriver.Chrome(executable_path=r'C:\Scrapers\chromedriver.exe')
browser.get("https://icostats.com")
browser.find_element_by_xpath("""//*[@id="app"]/div/div[2]/div[2]/div[2]/div[2]/div[8]/span/span""").s()
posts = browser.find_element_by_class_name("tdPrimary-0-75")
for post in posts:
    print(post.text)

The errors I'm getting:

C:\Python36\python.exe C:/.../PycharmProjects/PyQtPS/ICO_spyder.py
Traceback (most recent call last):
  File "C:/.../PycharmProjects/PyQtPS/ICO_spyder.py", line 5, in <module>
    browser.find_element_by_xpath("""//*[@id="app"]/div/div[2]/div[2]/div[2]/div[1]/div[2]""").click()
  File "C:\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 313, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "C:\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 791, in find_element
    'value': value})['value']
  File "C:\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 256, in execute
    self.error_handler.check_response(response)
  File "C:\Python36\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="app"]/div/div[2]/div[2]/div[2]/div[1]/div[2]"}
  (Session info: chrome=59.0.3071.115)
  (Driver info: chromedriver=2.30.477700 (0057494ad8732195794a7b32078424f92a5fce41),platform=Windows NT 6.1.7600 x86_64)

EDIT

Finally got it working:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait

browser = webdriver.Chrome(executable_path=r'C:\Scrapers\chromedriver.exe')
browser.get("https://icostats.com")
wait(browser, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "#app > div > div.container-0-16 > div.table-0-20 > div.tbody-0-21 > div:nth-child(2) > div:nth-child(8)")))

posts = browser.find_elements_by_class_name("thName-0-55")
for post in posts:
    print(post.text)

posts = browser.find_elements_by_class_name("tdName-0-73")
for post in posts:
    print(post.text)

Is there any way to iterate over every header/column and export it to a csv file without having to go through each class like this?

Upvotes: 0

Views: 1228

Answers (2)

Huang

Reputation: 609

  1. A WebElement has no s() method, so this line fails:

browser.find_element_by_xpath("""//*[@id="app"]/div/div[2]/div[2]/div[2]/div[2]/div[8]/span/span""").s()

If you want the element's text, what you need is probably the .text attribute:

browser.find_element_by_xpath("""//*[@id="app"]/div/div[2]/div[2]/div[2]/div[2]/div[8]/span/span""").text

  2. find_element_by_class_name returns a single WebElement, which cannot be iterated. Since you want to loop over the results, this line:

posts = browser.find_element_by_class_name("tdPrimary-0-75")

should be

posts = browser.find_elements_by_class_name("tdPrimary-0-75")

Upvotes: 1

Andersson

Reputation: 52665

The required data is generated dynamically by JavaScript, so you need to wait until it is present on the page:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait

browser = webdriver.Chrome(executable_path=r'C:\Scrapers\chromedriver.exe')
browser.get("https://icostats.com")
wait(browser, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "div#app>div")))
posts = browser.find_elements_by_class_name("tdPrimary-0-75")
for post in posts:
    print(post.text)
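
On the follow-up in the question's edit (exporting every column to CSV without naming each class): one possible approach is to walk the table row by row instead of class by class. The sketch below is untested against the live site and rests on assumptions: that each data row is a direct child div of div.tbody-0-21 (taken from the CSS selector in the question's edit), that each cell is a direct child div of its row, and that the generated class names have not changed. The helper names scrape_rows and write_table are made up for illustration. It uses the same Selenium 3 style find_elements_by_* calls as the rest of this thread.

```python
import csv


def scrape_rows(browser, row_selector="div.tbody-0-21 > div"):
    """Return one list of cell texts per table row.

    Assumption: every row matches row_selector and every cell is a
    direct child <div> of its row element.
    """
    rows = []
    for row in browser.find_elements_by_css_selector(row_selector):
        cells = row.find_elements_by_xpath("./div")
        rows.append([cell.text.strip() for cell in cells])
    return rows


def write_table(path, rows, header=None):
    """Write the optional header and all rows to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if header is not None:
            writer.writerow(header)
        writer.writerows(rows)
```

Called after the wait(...) line, write_table("icostats.csv", scrape_rows(browser)) would write one CSV line per table row. The header cells could be collected the same way by passing a selector for the header row as row_selector; that selector is not shown in this thread, so it would have to be looked up in the browser's inspector.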

Upvotes: 1
