Reputation: 5228
I am trying to use Selenium to scrape tables from the page linked below. However, pandas seems to return only the first table rather than all of them.
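from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
import pandas

weblink = 'http://sgx.com/wps/portal/sgxweb/home/company_disclosure/stockfacts?page=2&code=A68U&lang=en-us'
path_to_chromedriver = r'C:\chromedriver.exe'
driver = webdriver.Chrome(executable_path=path_to_chromedriver)
driver.get(weblink)
wait = WebDriverWait(driver, 8)
# locate and switch to the iframe
iframe = driver.find_element_by_css_selector("#mainContent iframe")
driver.switch_to.frame(iframe)
# wait until the element with id 'financials' inside the iframe is visible
wait.until(EC.visibility_of_element_located((By.ID, 'financials'))) # Should I be using this?
print(pandas.read_html(driver.page_source, flavor='bs4'))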
How can I get pandas to print out all tables instead of just the first one?
Upvotes: 0
Views: 415
Reputation: 25611
Did you look at the contents of the TABLE that was returned? It actually contains all 5 "tables". What look like separate TABLE tags on the page are actually one TABLE with 5 TBODYs, formatted to look like separate TABLEs. You should familiarize yourself with the browser's dev console; I would recommend Chrome's. Right-click an element inside the table and choose Inspect from the context menu. Now hover over the elements in the dev console and the browser will highlight the corresponding element on the webpage. This is a good way to correlate elements in the HTML with elements on the page. If you do this on this page, you will see that there is only one TABLE tag and each TBODY looks like a separate TABLE.
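If you would rather confirm this from code than from the dev console, here is a minimal sketch that counts the TABLE tags and groups the rows by TBODY. It uses only the standard library's html.parser instead of pandas so that it runs without extra dependencies. SAMPLE_HTML, TbodySplitter and split_by_tbody are hypothetical names made up for illustration, and the sample markup is a tiny stand-in for driver.page_source, not the real page.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for driver.page_source: one TABLE holding two TBODY sections.
SAMPLE_HTML = """
<table id="financials">
  <tbody><tr><th>Income</th><th>2016</th></tr><tr><td>Revenue</td><td>100</td></tr></tbody>
  <tbody><tr><th>Balance</th><th>2016</th></tr><tr><td>Assets</td><td>250</td></tr></tbody>
</table>
"""


class TbodySplitter(HTMLParser):
    """Count TABLE tags and collect cell text row by row, grouped by TBODY."""

    def __init__(self):
        super().__init__()
        self.table_count = 0
        self.sections = []  # one list of rows per TBODY
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == "table":
            self.table_count += 1
        elif tag == "tbody":
            self.sections.append([])
        elif tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            if self._row is not None:
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            # Rows seen before any TBODY has opened are ignored in this sketch.
            if self.sections:
                self.sections[-1].append(self._row)
            self._row = None


def split_by_tbody(html):
    """Return (number of TABLE tags, list of row lists, one per TBODY)."""
    parser = TbodySplitter()
    parser.feed(html)
    parser.close()
    return parser.table_count, parser.sections


table_count, sections = split_by_tbody(SAMPLE_HTML)
print(table_count, len(sections))  # one TABLE, two TBODY sections
for rows in sections:
    print(rows)
```

With the real page you would pass driver.page_source (after switching to the iframe) instead of SAMPLE_HTML. Each entry in sections is a plain list of rows, so it could be handed to pandas.DataFrame if you want one frame per TBODY.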
Upvotes: 1