Reputation: 178
I want to scrape the table on this page. I tried using Selenium to read the data after the page loaded, but it did not work. Are there any other ways to scrape the table from that webpage?
EDIT: This is what I tried:
from selenium import webdriver
driver = webdriver.PhantomJS()
driver.get("https://steria.taleo.net/careersection/in_cs_ext_fs/jobsearch.ftl?lang=en&radiusType=K&location=462170431401&searchExpanded=true&radius=1")
print(driver.find_element_by_class_name('table').text)
driver.close()
Upvotes: 1
Views: 254
Reputation: 52665
Because the table content is generated dynamically, you need to wait until the
JavaScript has executed before you can read the data:
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait

driver = webdriver.PhantomJS()
driver.get("https://steria.taleo.net/careersection/in_cs_ext_fs/jobsearch.ftl?lang=en&radiusType=K&location=462170431401&searchExpanded=true&radius=1")

# Wait up to 10 seconds for the jobs table to contain at least one row
table = wait(driver, 10).until(EC.presence_of_element_located(("xpath", "//table[@id='jobs' and ./tbody/tr]")))
print(table.text)

# Move to the next page of results
next_button = driver.find_element_by_link_text("Next")
next_button.click()

# Wait up to 5 seconds for the clicked button's aria-disabled attribute to become "true"
wait(driver, 5).until(lambda x: next_button.get_attribute("aria-disabled") == "true")

# Locate the table again and print the new rows
table = wait(driver, 10).until(EC.presence_of_element_located(("xpath", "//table[@id='jobs' and ./tbody/tr]")))
print(table.text)
driver.close()
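To make the idea of an explicit wait concrete, here is a minimal, browser-free sketch of the polling loop that a call like wait(driver, 10).until(...) roughly performs. The names wait_until and rows_loaded are made up for this illustration and are not part of Selenium; the real WebDriverWait raises its own TimeoutException and passes the driver to the condition.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Hypothetical helper, not a Selenium API. Raises TimeoutError if the
    condition is still falsy once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for a page whose table rows only appear on the third check.
attempts = {"count": 0}


def rows_loaded():
    attempts["count"] += 1
    return ["row 1", "row 2"] if attempts["count"] >= 3 else []


rows = wait_until(rows_loaded, timeout=2.0, poll_interval=0.01)
print(rows)
```

The first two checks return an empty list, so the loop keeps polling; the third returns the rows and the wait ends, which is the same reason the XPath above requires ./tbody/tr before the table counts as present.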
Upvotes: 3
Reputation: 1194
You can try Beautiful Soup to parse the table out of the HTML; see this article: http://srome.github.io/Parsing-HTML-Tables-in-Python-with-BeautifulSoup-and-pandas/
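As a rough illustration of parsing a table from HTML, the sketch below turns table markup into a list of rows. It deliberately uses the standard-library html.parser instead of Beautiful Soup so that it runs without extra installs; Beautiful Soup would give the same result with less code. The class TableParser, the function parse_table, and the sample HTML are all made up for this example. For a JavaScript-generated table, the HTML would have to come from a rendered page, for example Selenium's driver.page_source after the table has loaded.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return the table cells in `html` as a list of rows (lists of strings)."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample markup; the real page's columns are not known here.
sample_html = """
<table id="jobs">
  <thead><tr><th>Title</th><th>Location</th></tr></thead>
  <tbody>
    <tr><td>Example Analyst</td><td>Example City</td></tr>
    <tr><td>Example Engineer</td><td>Example Town</td></tr>
  </tbody>
</table>
"""

rows = parse_table(sample_html)
for row in rows:
    print(row)
```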
Upvotes: 0