jake wong

Reputation: 5228

Pandas read_html only returning 1 table when 5 tables are found

I am trying to use Selenium to scrape the tables from the page linked below. However, pandas seems to return only the first table rather than all of them.

import pandas
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

weblink = 'http://sgx.com/wps/portal/sgxweb/home/company_disclosure/stockfacts?page=2&code=A68U&lang=en-us'
path_to_chromedriver = r'C:\chromedriver.exe'
driver = webdriver.Chrome(executable_path=path_to_chromedriver)
driver.get(weblink)
wait = WebDriverWait(driver, 8)
# locate and switch to the iframe
iframe = driver.find_element_by_css_selector("#mainContent iframe")
driver.switch_to.frame(iframe)

wait.until(EC.visibility_of_element_located((By.ID, 'financials')))  # Should I be using this?

print(pandas.read_html(driver.page_source, flavor='bs4'))

How can I get pandas to print out all tables instead of just the first one?

Upvotes: 0

Views: 415

Answers (1)

JeffC

Reputation: 25611

Did you look at the content of the TABLE that was returned? It actually contains all 5 "tables". What looks like 5 separate TABLE tags on the page is actually a single TABLE containing 5 TBODYs, each formatted to look like a separate TABLE.

You should familiarize yourself with the browser's dev tools. I would recommend Chrome's. Right-click on an element inside the table and choose Inspect from the context menu. Now hover over the elements in the dev tools and the browser will highlight the corresponding element on the webpage. This is a good way to correlate elements in the HTML with elements on the page. If you do this on this page, you will see that there is only one TABLE tag and that each TBODY looks like a separate TABLE.
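If you want one group of rows per visual "table", one option is to split the single TABLE on its TBODY boundaries yourself. The sketch below is only an illustration using the standard library's `html.parser`. The `TbodySplitter` class, the `split_by_tbody` helper, and the `SAMPLE_HTML` markup with its values are all made up for this example and are not taken from the SGX page, whose real markup may be more complicated (nested tables, THEAD/TFOOT rows, and so on).

```python
from html.parser import HTMLParser


class TbodySplitter(HTMLParser):
    """Count TABLE tags and collect cell text grouped by TBODY (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.table_count = 0
        self.groups = []        # one list of rows per TBODY
        self._in_tbody = False  # True while inside a TBODY
        self._row = None        # cells of the row being built, or None
        self._cell = None       # text pieces of the cell being built, or None

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == "table":
            self.table_count += 1
        elif tag == "tbody":
            self._in_tbody = True
            self.groups.append([])
        elif tag == "tr" and self._in_tbody:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.groups[-1].append(self._row)
            self._row = None
        elif tag == "tbody":
            self._in_tbody = False


def split_by_tbody(html):
    """Return (number of TABLE tags, list of row lists, one per TBODY)."""
    parser = TbodySplitter()
    parser.feed(html)
    parser.close()
    return parser.table_count, parser.groups


# Made-up markup: one TABLE whose TBODYs look like separate tables.
SAMPLE_HTML = """
<table id="financials">
  <tbody><tr><th>Valuation</th><th>Value</th></tr><tr><td>P/E</td><td>12.3</td></tr></tbody>
  <tbody><tr><th>Growth</th><th>Value</th></tr><tr><td>Revenue</td><td>4.5</td></tr></tbody>
</table>
"""

table_count, groups = split_by_tbody(SAMPLE_HTML)
print(table_count, len(groups))
```

With the Selenium code from the question you would presumably call `split_by_tbody(driver.page_source)` after switching into the iframe. Each entry in `groups` is a plain list of rows, so if the first row of a group really is its header, something like `pandas.DataFrame(group[1:], columns=group[0])` should turn it into its own DataFrame.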

Upvotes: 1
