Reputation: 47
The website below has several tables, but my code cannot retrieve the one I need (or any other table).
I want the data from the table "Ações em Circulação no Mercado", which is one of the last tables on the page.
I have tried the code below and a few alternatives, but none of them worked:
import pandas as pd
from selenium import webdriver
from time import sleep
url = "http://bvmf.bmfbovespa.com.br/cias-Listadas/Empresas-Listadas/BuscaEmpresaListada.aspx?idioma=pt-br"
Ticker='ITUB4'
browser = webdriver.Chrome()
browser.get(url)
sleep(2) #Wait webpage to load
browser.find_element_by_xpath(('//*[@id="ctl00_contentPlaceHolderConteudo_BuscaNomeEmpresa1_txtNomeEmpresa_txtNomeEmpresa_text"]')).send_keys(Ticker)
browser.find_element_by_xpath(('//*[@id="ctl00_contentPlaceHolderConteudo_BuscaNomeEmpresa1_btnBuscar"]')).click();
sleep(2) #Wait webpage to load
browser.find_element_by_xpath(('//*[@id="ctl00_contentPlaceHolderConteudo_BuscaNomeEmpresa1_grdEmpresa_ctl01"]/tbody/tr/td[1]/a')).click();
sleep(5) #Wait webpage to load
#This is not working
content = browser.find_element_by_css_selector('//div[@id="div1"]')
#This is not working as well
#browser.find_element_by_xpath('//*[@id="div1"]/div/div/div[1]/table/tbody/tr[1]/td[1]').text
The relevant HTML is:
<div id="div1">
<div>
<h3>Ações em Circulação no Mercado</h3>
<div class="table-wrapper"><div class="scrollable"><table class="responsive">
<thead>
<tr>
<th colspan="3" class="text-center">19/04/2017</th>
</tr>
<tr>
<td>Tipos de Investidores / Ações</td>
<td class="text-center">Quantidade</td>
<td class="text-center">Percentual</td>
</tr>
</thead>
<tbody><tr>
<td>Pessoas Físicas</td>
<td class="text-right">108.853</td>
<td class="text-right"> - </td>
</tr>
<tr>
<td>Pessoas Jurídicas</td>
<td class="text-right">11.591</td>
<td class="text-right"> - </td>
</tr>
<tr>
<td>Investidores Institucionais</td>
<td class="text-right">1.039</td>
<td class="text-right"> - </td>
</tr>
<tr>
<td>Quantidade de Ações Ordinárias</td>
<td class="text-right">272.710.309</td>
<td class="text-right">8,21</td>
</tr>
<tr>
<td>Quantidade de Ações Preferenciais</td>
<td class="text-right">3.141.058.175</td>
<td class="text-right">97,23</td>
</tr>
<tr>
<td>Total de Ações</td>
<td class="text-right">3.413.768.484</td>
<td class="text-right">52,11</td>
</tr>
</tbody></table></div><div class="pinned"></div></div>
</div>
</div>
Upvotes: 0
Views: 2482
Reputation: 193098
To locate the element and extract the text "Pessoas Físicas", you can use the following line of code:
content = browser.find_element_by_xpath("//h3[.='Ações em Circulação no Mercado']/following::table[@class='responsive'][1]/tbody/tr[1]/td[1]").get_attribute("innerHTML")
The XPath expression
//h3[.='Ações em Circulação no Mercado']/following::table[@class='responsive'][1]/tbody/tr[1]/td[1]
contains single quotes, so it shouldn't be wrapped in single quotes, e.g. 'xpath_here'. Put the expression within double quotes instead, e.g. "xpath_here".
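The quoting advice concerns Python string literals rather than XPath itself: an expression that contains single quotes is easiest to write inside a double-quoted Python string. A minimal sketch, in which the variable names are purely illustrative and the expression is a simplified one built from the `div1` id and `responsive` class shown in the question:

```python
# The attribute values inside the XPath use single quotes, so a
# double-quoted Python string needs no escaping at all.
xpath_double = "//div[@id='div1']//table[@class='responsive']"

# A single-quoted Python string also works, but every inner single
# quote then has to be escaped with a backslash.
xpath_single = '//div[@id=\'div1\']//table[@class=\'responsive\']'

# Building the expression from variables keeps the quoting in one place.
# (div_id and css_class are hypothetical names used only for this sketch.)
div_id = "div1"
css_class = "responsive"
xpath_built = "//div[@id='{}']//table[@class='{}']".format(div_id, css_class)

print(xpath_double == xpath_single == xpath_built)
```

All three strings are identical, so any of them can be passed to the XPath locator; the double-quoted form is simply the least error-prone to type.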
Upvotes: 1
Reputation: 2690
One quick correction is to change
content = browser.find_element_by_css_selector('//div[@id="div1"]')
to
content = browser.find_element_by_xpath('//div[@id="div1"]')
because the expression you're using is an XPath, not a CSS selector.
The second attempt might fail because the div1 element is not scrolled into view. Selenium does not interact well with elements that are not visible, so try this:
element = browser.find_element_by_xpath('//*[@id="div1"]')
# Force the element to be scrolled into view, even if you don't need its location.
location = element.location_once_scrolled_into_view
# Now Selenium can get its text.
text = element.text
Upvotes: 1
Reputation: 21
You wrote an XPath expression where a CSS selector is expected. Use
tables = browser.find_elements_by_css_selector('.responsive')
if you want all the tables, and then parse them. Or use
browser.find_element_by_xpath(".//*[@id='div1']//table")
to locate the exact table (in the posted HTML the table is nested several divs below div1, hence the // before table).
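Once a table element has been located, the "parse" step could look roughly like the sketch below. It assumes the table's markup has already been read into a string (for example via `get_attribute("outerHTML")`); here a shortened copy of the question's HTML stands in for it. The `TableRows` class and `to_number` helper are hypothetical names written for this illustration, and only the standard library is used.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def to_number(text):
    """Convert Brazilian-formatted text ('3.141.058.175', '97,23') to float.

    Returns None for placeholders such as '-'.
    """
    cleaned = text.strip().replace(".", "").replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        return None


# Stand-in for the located table's outerHTML (shortened from the question).
sample_html = """
<table class="responsive">
<thead>
<tr><th colspan="3" class="text-center">19/04/2017</th></tr>
<tr><td>Tipos de Investidores / Ações</td><td>Quantidade</td><td>Percentual</td></tr>
</thead>
<tbody>
<tr><td>Pessoas Físicas</td><td class="text-right">108.853</td><td class="text-right"> - </td></tr>
<tr><td>Quantidade de Ações Preferenciais</td><td class="text-right">3.141.058.175</td><td class="text-right">97,23</td></tr>
</tbody>
</table>
"""

parser = TableRows()
parser.feed(sample_html)
parser.close()

# The first two rows are the date and the column headers; the rest is data.
data = [(r[0], to_number(r[1]), to_number(r[2])) for r in parser.rows[2:]]
for row in data:
    print(row)
```

The number conversion is the part most likely to need adjusting: the page uses "." as the thousands separator and "," as the decimal mark, so the values cannot be passed to `float` unchanged.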
Upvotes: 1