Reputation: 3818
Below is the HTML source of a table whose structure seems too complex for Selenium to read. Can somebody help me read this data into Python using Selenium?
<div class="general_table">
<div class="general_s">
<div class="general_text1">Name</div>
<div class="general_text2">Abhishek</div>
</div>
<div class="general_m">
<div class="general_text1">Last Name</div>
<div class="general_text2">Kulkarni</div>
</div>
<div class="general_s">
<div class="general_text1">Phone</div>
<div class="general_text2"> 13613123</div>
</div>
<div class="general_m">
<div class="general_text1">Cell Phone</div>
<div class="general_text2">82928091</div>
</div>
<div class="general_s">
<div class="general_text1">City</div>
<div class="general_text2"></div>
</div>
<div class="general_m">
<div class="general_text1">Model</div>
<div class="general_text2"> DELL PERC H700</div>
</div>
</div>
Upvotes: 3
Views: 17647
Reputation: 3858
To read this table with Selenium WebDriver, XPath is probably the easiest way.
I don't know Python well, so the code may need adjusting, but the idea should be sound.
To find the number of div tags directly inside general_table, use this XPath:
from selenium.webdriver.common.by import By
rows = driver.find_elements(By.XPATH, "//*[@class='general_table']/div")
which returns a list of 6 elements.
Then loop through the rows by position (XPath indexes start at 1):
for i in range(1, len(rows) + 1):
    row_xpath = "//*[@class='general_table']/div[%d]" % i
    text1 = driver.find_element(By.XPATH, row_xpath + "/div[1]").text
    text2 = driver.find_element(By.XPATH, row_xpath + "/div[2]").text
You can read every cell in the table this way.
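As a variation on the loop above, the cells can also be looked up relative to each row element instead of re-querying the whole page by index. This is only a rough sketch: `read_general_table` is a hypothetical helper name, and it assumes every row holds exactly a label div followed by a value div, as in the question's HTML. The locator strategy is passed as the plain string "xpath", which is the value of Selenium's `By.XPATH` constant.

```python
XPATH = "xpath"  # same string as selenium.webdriver.common.by.By.XPATH


def read_general_table(driver):
    """Return {label: value} for each row of the div-based table.

    `driver` is assumed to be a Selenium WebDriver (or anything exposing
    the same find_elements(by, value) interface).
    """
    data = {}
    rows = driver.find_elements(XPATH, "//*[@class='general_table']/div")
    for row in rows:
        # "./div" is evaluated relative to the current row element
        cells = row.find_elements(XPATH, "./div")
        if len(cells) < 2:
            continue  # skip rows that do not have a label and a value
        data[cells[0].text.strip()] = cells[1].text.strip()
    return data
```

Called as `read_general_table(driver)` once the page has loaded, this should give a dictionary keyed by the labels (Name, Last Name, Phone, and so on), with an empty string for an empty cell such as City.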
Upvotes: 3
Reputation: 77251
Use Selenium to grab the page source (driver.page_source), so you get the real content after all the JS/AJAX has run, and something like BeautifulSoup to parse it.
from bs4 import BeautifulSoup
soup = BeautifulSoup("""<div class="general_table">
<div class="general_s">
<div class="general_text1">Name</div>
<div class="general_text2">Abhishek</div>
</div>
<div class="general_m">
<div class="general_text1">Last Name</div>
<div class="general_text2">Kulkarni</div>
</div>
<div class="general_s">
<div class="general_text1">Phone</div>
<div class="general_text2"> 13613123</div>
</div>
<div class="general_m">
<div class="general_text1">Cell Phone</div>
<div class="general_text2">82928091</div>
</div>
<div class="general_s">
<div class="general_text1">City</div>
<div class="general_text2"></div>
</div>
<div class="general_m">
<div class="general_text1">Model</div>
<div class="general_text2"> DELL PERC H700</div>
</div>
</div>""", "html.parser")
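def tags(iterable):
    # Keep only element nodes; skip the whitespace strings between tags.
    return [x for x in iterable if not isinstance(x, str)]

for table in soup.find_all('div', {'class': 'general_table'}):
    for line in tags(table.contents):
        for i, column in enumerate(tags(line.contents)):
            if column.string:  # None for an empty div such as City
                print(column.string.strip(), end=' ')
                # The label column (i == 0) is followed by ':', the value by ','
                print(',' if i else ':', end=' ')
        print()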
Result:
Name : Abhishek ,
Last Name : Kulkarni ,
Phone : 13613123 ,
Cell Phone : 82928091 ,
City :
Model : DELL PERC H700 ,
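If the pairs are needed as data rather than printed text, the same idea can be sketched with only the standard library. Note that this swaps BeautifulSoup for Python's built-in `html.parser`; `GeneralTableParser` and `parse_general_table` are hypothetical names, and the sketch assumes the `general_text1` and `general_text2` cells never contain nested divs.

```python
from html.parser import HTMLParser


class GeneralTableParser(HTMLParser):
    """Collect (label, value) pairs from general_text1/general_text2 divs."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished (label, value) tuples
        self._field = None   # "label", "value", or None when outside a cell
        self._label = ""
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if "general_text1" in classes:
            self._field, self._buffer = "label", []
        elif "general_text2" in classes:
            self._field, self._buffer = "value", []

    def handle_data(self, data):
        if self._field:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != "div" or self._field is None:
            return
        text = "".join(self._buffer).strip()
        if self._field == "label":
            self._label = text
        else:
            self.rows.append((self._label, text))
        self._field = None


def parse_general_table(html):
    parser = GeneralTableParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

With Selenium, the input would be `parse_general_table(driver.page_source)`; an empty cell such as City comes back as an empty string rather than being skipped.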
Upvotes: 1