Abhishek Kulkarni

Reputation: 3818

How to read table data using Selenium in Python?

Below is the HTML source of a table that seems too complex for me to read with Selenium. Can somebody help me read this data into Python using Selenium?

<div class="general_table">
    <div class="general_s">
        <div class="general_text1">Name</div>
        <div class="general_text2">Abhishek</div>
    </div>
    <div class="general_m">
        <div class="general_text1">Last Name</div>
        <div class="general_text2">Kulkarni</div>
    </div>
    <div class="general_s">
        <div class="general_text1">Phone</div>
        <div class="general_text2"> 13613123</div>
    </div>
    <div class="general_m">
        <div class="general_text1">Cell Phone</div>
        <div class="general_text2">82928091</div>
    </div>         
    <div class="general_s">
        <div class="general_text1">City</div>
        <div class="general_text2"></div>
    </div>
    <div class="general_m">
        <div class="general_text1">Model</div>
        <div class="general_text2"> DELL PERC H700</div>
    </div>
</div>

Upvotes: 3

Views: 17647

Answers (2)

Hari Reddy

Reputation: 3858

To read this table using Selenium WebDriver, XPath seems to be the easiest way.

I do not know Python well, so the code might be wrong, but the idea should be right.

To find the number of div tags within general_table, use the XPath:

driver.find_elements(By.XPATH, "//*[@class='general_table']/div"), which returns a list of 6 elements.

Then you can loop over the rows by index (XPath positions start at 1):

from selenium.webdriver.common.by import By

rows = driver.find_elements(By.XPATH, "//*[@class='general_table']/div")
for i in range(1, len(rows) + 1):
    # div[1] holds the label and div[2] holds the value of row i
    text1 = driver.find_element(By.XPATH, "//*[@class='general_table']/div[%d]/div[1]" % i).text
    text2 = driver.find_element(By.XPATH, "//*[@class='general_table']/div[%d]/div[2]" % i).text

You can read every cell in the table this way.
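
A variation on the same idea, as a rough sketch rather than a tested recipe: query the rows once, then use a relative XPath inside each row instead of rebuilding the full path per index. The helper name read_div_table is made up for illustration, and the string "xpath" is the value of Selenium's By.XPATH, written out so the function itself needs no Selenium import.

```python
def read_div_table(driver, by="xpath"):
    """Return {label: value} for each row div inside general_table.

    `driver` is assumed to be a Selenium WebDriver (or any object with a
    compatible find_elements(by, value) method). "xpath" is the value of
    selenium.webdriver.common.by.By.XPATH.
    """
    data = {}
    # One query for all row divs, then relative lookups inside each row.
    for row in driver.find_elements(by, "//div[@class='general_table']/div"):
        cells = row.find_elements(by, "./div")
        if len(cells) >= 2:
            # First cell is the label, second cell is the value.
            data[cells[0].text.strip()] = cells[1].text.strip()
    return data
```

With a live browser session you would call read_div_table(driver) after driver.get(...). An empty cell such as City comes back as an empty string.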

Upvotes: 3

Paulo Scardine

Reputation: 77251

Use Selenium to grab the page source (so you get the real content after all the JS/AJAX has run) and something like BeautifulSoup to parse it.

from bs4 import BeautifulSoup

soup = BeautifulSoup("""<div class="general_table">
    <div class="general_s">
        <div class="general_text1">Name</div>
        <div class="general_text2">Abhishek</div>
    </div>
    <div class="general_m">
        <div class="general_text1">Last Name</div>
        <div class="general_text2">Kulkarni</div>
    </div>
    <div class="general_s">
        <div class="general_text1">Phone</div>
        <div class="general_text2"> 13613123</div>
    </div>
    <div class="general_m">
        <div class="general_text1">Cell Phone</div>
        <div class="general_text2">82928091</div>
    </div>         
    <div class="general_s">
        <div class="general_text1">City</div>
        <div class="general_text2"></div>
    </div>
    <div class="general_m">
        <div class="general_text1">Model</div>
        <div class="general_text2"> DELL PERC H700</div>
    </div>
</div>""", "html.parser")

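def tags(iterable):
    # Keep only Tag nodes; the whitespace between elements parses as strings
    return [x for x in iterable if not isinstance(x, str)]

for table in soup.find_all('div', {'class': 'general_table'}):
    for line in tags(table.contents):
        for i, column in enumerate(tags(line.contents)):
            # column.string is None for an empty cell, so nothing is printed
            if column.string:
                print(column.string.strip(), end=' ')
            # ':' follows the label (i == 0), ',' follows the value
            if i:
                print(',', end=' ')
            else:
                print(':', end=' ')
        print('')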

Result:

Name : Abhishek , 
Last Name : Kulkarni , 
Phone : 13613123 , 
Cell Phone : 82928091 , 
City : , 
Model : DELL PERC H700 , 
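
If installing BeautifulSoup is not an option, the same parse can be approximated with the standard library's html.parser. This is a swapped-in technique, not the approach above, and only a minimal sketch: the names DivTableParser and parse_div_table are made up for illustration, and it assumes the label and value cells never contain nested divs.

```python
from html.parser import HTMLParser


class DivTableParser(HTMLParser):
    """Collect (label, value) pairs from general_text1 / general_text2 divs."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished (label, value) pairs
        self._field = None   # class of the cell currently being read, if any
        self._label = ""     # most recent label text
        self._buffer = []    # text chunks of the current cell

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class in ("general_text1", "general_text2"):
            self._field = css_class
            self._buffer = []

    def handle_data(self, data):
        # Only collect text while inside a label or value cell.
        if self._field:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != "div" or self._field is None:
            return
        text = "".join(self._buffer).strip()
        if self._field == "general_text1":
            self._label = text
        else:
            self.rows.append((self._label, text))
        self._field = None


def parse_div_table(html):
    """Return a list of (label, value) tuples found in the given HTML."""
    parser = DivTableParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

In a live session the input would be driver.page_source, for example dict(parse_div_table(driver.page_source)).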

Upvotes: 1
