S Andrew

Reputation: 7208

How to get data from table using selenium in Python

I have this URL, which contains a table. I need to get the data from every row and column of the table, across all of its pages. I don't understand how to get the data out of the table. Below is the code I have so far:

from selenium import webdriver
import os
import time
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.support.ui import Select
from pynput.keyboard import Key, Controller

curr_path = os.path.dirname(os.path.abspath(__file__))

keyboard = Controller()

driver = webdriver.Firefox()
driver.get("http://silk.dephut.go.id/index.php/info/iuiphhk")
driver.maximize_window()

The code above opens Firefox and loads the URL. I use the code below to click through to the next page:

next_btn = (By.XPATH, "//div[@id='silk_content_wrapper']//ul[1]//li[4]//a[1]")
WebDriverWait(driver, 30).until(ec.element_to_be_clickable(next_btn)).click() 

But I don't understand how to get the data from the table. I am not from a web development background, so I can't make sense of the website's code. I referred to the accepted answer to this question and extracted the ID of the table:

table_id = driver.find_element(By.ID, 'diviuiphhk')

But I didn't find IDs for the rows, so I can't get their values. To find the ID or XPath of any element on a page, I use ChroPath. Can anyone please help me understand how to get the data from the table? Thanks.

Upvotes: 2

Views: 8780

Answers (2)

Rahul

Reputation: 31

This will give you all the cells of the table, and you can extract the data from each cell's text:

driver.find_elements(By.XPATH, "//table[@class='table']/tbody/tr/td")
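
Because that XPath returns every cell as one flat list, the cells still have to be grouped back into rows. A minimal sketch of one way to do that, assuming every row has the same number of columns; the helper names read_cell_texts and cells_to_rows and the n_columns parameter are hypothetical, not part of Selenium:

```python
XPATH = "xpath"  # same string value as selenium's By.XPATH


def read_cell_texts(driver):
    """Return the text of every <td> in the table as one flat list."""
    cells = driver.find_elements(XPATH, "//table[@class='table']/tbody/tr/td")
    return [cell.text for cell in cells]


def cells_to_rows(cell_texts, n_columns):
    """Split a flat list of cell strings into rows of n_columns each.

    Assumes every row has exactly n_columns cells (an assumption about
    the page, not something the original answer states).
    """
    return [
        cell_texts[i:i + n_columns]
        for i in range(0, len(cell_texts), n_columns)
    ]
```

With a live driver this would be called as cells_to_rows(read_cell_texts(driver), n_columns), where n_columns is whatever column count the table on the page actually has.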

Upvotes: 1

S Andrew

Reputation: 7208

I was able to solve it. Below is the code:

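# Locate the table once; the row lookups below are scoped to it.
table_id = driver.find_element(By.XPATH, "//table[@class='table']")

# Rows are addressed by position, 1 to 10 (XPath positions start at 1).
for row in range(1, 11):
    # The leading "." keeps the search inside table_id; a bare "//" would search the whole page.
    rows = table_id.find_elements(By.XPATH, ".//tbody//tr[" + str(row) + "]")
    for row_data in rows:
        # Each column of the row is a <td> element.
        col = row_data.find_elements(By.TAG_NAME, "td")
        for cell in col:
            print(cell.text)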

First I used ChroPath to get the XPath of the table. Then I got the XPath of a row. The row XPath was the same for every row of the table; only the index changes, from 1 to 10. The columns inside each row are td elements, so I used that tag name to get the column values.
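
The same idea can be written without hard-coding the row numbers, and extended to cover several pages. This is only a sketch under assumptions: read_table, scrape_all_pages, get_table, go_to_next_page and max_pages are hypothetical names, and how to detect the last page depends on the site, so that part is left to a callback supplied by the caller.

```python
XPATH = "xpath"        # same string value as selenium's By.XPATH
TAG_NAME = "tag name"  # same string value as selenium's By.TAG_NAME


def read_table(table):
    """Return the table body as a list of rows, each a list of cell strings."""
    rows = []
    # The leading "." keeps the search inside `table`.
    for tr in table.find_elements(XPATH, ".//tbody/tr"):
        rows.append([td.text for td in tr.find_elements(TAG_NAME, "td")])
    return rows


def scrape_all_pages(get_table, go_to_next_page, max_pages=1000):
    """Collect rows from every page.

    get_table: callable returning the table element of the current page.
    go_to_next_page: callable that moves to the next page and returns
        True, or returns False when there is no next page.
    max_pages: safety cap so a faulty callback cannot loop forever.
    """
    all_rows = []
    for _ in range(max_pages):
        all_rows.extend(read_table(get_table()))
        if not go_to_next_page():
            break
    return all_rows
```

Here get_table could wrap the driver.find_element call for the table, and go_to_next_page could wrap the click on the next button from the question, returning False once that button can no longer be clicked.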

Thanks

Upvotes: 6
