Ricardo

Reputation: 47

Unable to get table element in website using Selenium

The website below has several tables, but my code cannot get a specific one (or any other table, for that matter).

The code aims to get the data from the table "Ações em Circulação no Mercado", which is one of the last tables on the page.

I have tried the code below and some alternatives, but none of them worked:

import pandas as pd
from selenium import webdriver
from time import sleep

url = "http://bvmf.bmfbovespa.com.br/cias-Listadas/Empresas-Listadas/BuscaEmpresaListada.aspx?idioma=pt-br"
Ticker='ITUB4'
browser = webdriver.Chrome()
browser.get(url)
sleep(2) #Wait webpage to load
browser.find_element_by_xpath(('//*[@id="ctl00_contentPlaceHolderConteudo_BuscaNomeEmpresa1_txtNomeEmpresa_txtNomeEmpresa_text"]')).send_keys(Ticker)
browser.find_element_by_xpath(('//*[@id="ctl00_contentPlaceHolderConteudo_BuscaNomeEmpresa1_btnBuscar"]')).click();
sleep(2) #Wait webpage to load
browser.find_element_by_xpath(('//*[@id="ctl00_contentPlaceHolderConteudo_BuscaNomeEmpresa1_grdEmpresa_ctl01"]/tbody/tr/td[1]/a')).click();
sleep(5) #Wait webpage to load

#This is not working
content = browser.find_element_by_css_selector('//div[@id="div1"]')

#This is not working as well
#browser.find_element_by_xpath('//*[@id="div1"]/div/div/div[1]/table/tbody/tr[1]/td[1]').text

The table and the full HTML were attached to the post as links (Table, HTML).

The relevant HTML is:

<div id="div1">
                <div>
                    <h3>Ações em Circulação no Mercado</h3>
                    <div class="table-wrapper"><div class="scrollable"><table class="responsive">

                        <thead>
                            <tr>
                                <th colspan="3" class="text-center">19/04/2017</th>
                            </tr>
                            <tr>
                                <td>Tipos de Investidores / Ações</td>
                                <td class="text-center">Quantidade</td>
                                <td class="text-center">Percentual</td>
                            </tr>
                        </thead>

                            <tbody><tr>
                                <td>Pessoas Físicas</td>
                                <td class="text-right">108.853</td>
                                <td class="text-right"> - </td>
                            </tr>

                            <tr>
                                <td>Pessoas Jurídicas</td>
                                <td class="text-right">11.591</td>
                                <td class="text-right"> - </td>
                            </tr>

                            <tr>
                                <td>Investidores Institucionais</td>
                                <td class="text-right">1.039</td>
                                <td class="text-right"> - </td>
                            </tr>

                            <tr>
                                <td>Quantidade de Ações Ordinárias</td>
                                <td class="text-right">272.710.309</td>
                                <td class="text-right">8,21</td>
                            </tr>

                            <tr>
                                <td>Quantidade de Ações Preferenciais</td>
                                <td class="text-right">3.141.058.175</td>
                                <td class="text-right">97,23</td>
                            </tr>

                            <tr>
                                <td>Total de Ações</td>
                                <td class="text-right">3.413.768.484</td>
                                <td class="text-right">52,11</td>
                            </tr>

                            </tbody></table></div><div class="pinned"></div></div>
                </div>
                </div>

Upvotes: 0

Views: 2482

Answers (3)

undetected Selenium

Reputation: 193098

To locate the WebElement and extract the text Pessoas Físicas, you can use the following line of code (driver is the WebDriver instance, named browser in the question):

content = driver.find_element_by_xpath("//h3[normalize-space(.)='Ações em Circulação no Mercado']/following::table[@class='responsive'][1]/tbody/tr[1]/td[1]").get_attribute("innerHTML")

Update (no code change)

The XPath expression:

//h3[normalize-space(.)='Ações em Circulação no Mercado']/following::table[@class='responsive'][1]/tbody/tr[1]/td[1]

contains single-quoted string literals, so the Python string that holds it shouldn't itself be wrapped in single quotes, e.g. 'xpath_here'. Put the expression within double quotes, e.g. "xpath_here".

Upvotes: 1

Ron Norris

Reputation: 2690

One quick correction: change content = browser.find_element_by_css_selector('//div[@id="div1"]') to content = browser.find_element_by_xpath('//div[@id="div1"]'), because the string you are passing is an XPath, not a CSS selector.

The reason the second attempt is not working might be that the div1 element is not scrolled into view. Selenium does not interact well with elements that are not visible. So try this:

element = browser.find_element_by_xpath('//*[@id="div1"]')
# Force the element to be scrolled into view, even if you don't need its location.
location = element.location_once_scrolled_into_view
# Now Selenium can get its text.
text = element.text
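
As a rough sanity check that the path in the second attempt really matches the posted markup (so the problem is more likely visibility or timing than the XPath itself), here is a small offline sketch. It assumes a trimmed, well-formed copy of the HTML from the question and uses the standard library's xml.etree.ElementTree, which supports only a limited XPath subset and is not what Selenium uses internally; the names page, absolute and relative are just for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the markup from the question, wrapped in <body> so that
# ".//" (which searches below the root) can reach the div1 element.
page = ET.fromstring("""
<body>
  <div id="div1">
    <div>
      <h3>Ações em Circulação no Mercado</h3>
      <div class="table-wrapper"><div class="scrollable"><table class="responsive">
        <tbody>
          <tr><td>Pessoas Físicas</td><td class="text-right">108.853</td></tr>
        </tbody>
      </table></div><div class="pinned"></div></div>
    </div>
  </div>
</body>
""")

# Same step-by-step path as the commented-out attempt in the question.
absolute = page.find(".//*[@id='div1']/div/div/div[1]/table/tbody/tr[1]/td[1]")

# Shorter, less brittle path: any table below div1.
relative = page.find(".//*[@id='div1']//table")

print(absolute.text)           # text of the first body cell
print(relative.get("class"))   # class attribute of the located table
```

If both lookups succeed on the static markup but Selenium still returns nothing, that points towards the element not being loaded or not being visible yet.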

Upvotes: 1

Karen Avagyan

Reputation: 21

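You wrote an XPath in a CSS selector call. Use tables = browser.find_elements_by_css_selector('.responsive') if you want all the tables, and then parse the data from them, or use browser.find_element_by_xpath("//*[@id='div1']//table") to locate that exact table. Note that the table is not a direct child of the inner div (it sits inside the table-wrapper and scrollable divs), hence the //table step.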
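
For the "parse from them" step, a minimal sketch follows. It assumes you already have the table markup as a string (for example from element.get_attribute("outerHTML")) and uses only the standard library. TableRows, parse_table and to_number are hypothetical helper names, not part of Selenium, and the number conversion assumes the Brazilian format shown in the question ("." for thousands, "," for decimals).

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row currently being read
        self._cell = None   # text chunks of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


def to_number(text):
    """Convert '3.413.768.484' or '52,11' to a float; '-' or '' gives None."""
    text = text.strip()
    if text in ("", "-"):
        return None
    return float(text.replace(".", "").replace(",", "."))


# Shortened copy of the markup from the question.
sample_html = """
<table class="responsive">
  <thead>
    <tr><th colspan="3" class="text-center">19/04/2017</th></tr>
    <tr><td>Tipos de Investidores / Ações</td><td class="text-center">Quantidade</td><td class="text-center">Percentual</td></tr>
  </thead>
  <tbody>
    <tr><td>Pessoas Físicas</td><td class="text-right">108.853</td><td class="text-right"> - </td></tr>
    <tr><td>Total de Ações</td><td class="text-right">3.413.768.484</td><td class="text-right">52,11</td></tr>
  </tbody>
</table>
"""

rows = parse_table(sample_html)
for row in rows:
    print(row)
```

The first two rows are the date and the column headers, so the data rows start at rows[2]; from there the list can be passed to pandas.DataFrame if you want a DataFrame.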

Upvotes: 1
