Conrado Manclossi

Reputation: 5

Python scraping selenium - table data not in code

I am trying to extract the table data from this page.

I tried bs4 and Selenium, but the table data does not appear in the page source. Adding a wait in Selenium did not help either.

from selenium import webdriver
from selenium.webdriver.common.by import By

url = 'https://www.rad.cvm.gov.br/ENETCONSULTA/frmGerenciaPaginaFRE.aspx?NumeroSequencialDocumento=82594&CodigoTipoInstituicao=2'
driver = webdriver.Safari()
driver.get(url)
# find_element_by_tag_name was removed in Selenium 4; use find_element(By.TAG_NAME, ...)
iframe = driver.find_element(By.TAG_NAME, 'iframe')
driver.switch_to.frame(iframe)
print(driver.page_source)

Upvotes: 0

Views: 97

Answers (1)

sleeping_coder

Reputation: 99

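pandas can help you out here. Once you have switched into the iframe, pass the page source to pd.read_html, which returns a list of DataFrames, one per table it finds. This is what I did; the output looks better in a real console than it does pasted below. You may need to install lxml first, so first: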

!pip3 install lxml

then

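from io import StringIO

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By

url = 'https://www.rad.cvm.gov.br/ENETCONSULTA/frmGerenciaPaginaFRE.aspx?NumeroSequencialDocumento=82594&CodigoTipoInstituicao=2'
driver = webdriver.Chrome()
driver.get(url)
# The table is rendered inside an iframe, so switch into it first
iframe = driver.find_element(By.TAG_NAME, 'iframe')
driver.switch_to.frame(iframe)

# Wrap the HTML in StringIO; recent pandas versions warn on raw HTML strings
dfs = pd.read_html(StringIO(driver.page_source))
print(dfs[0].head())
driver.quit()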

#output
      0                                                  1  \
0    Conta                                          Descrição   
1     3.01               Receitas da Intermediação Financeira   
2  3.01.01                     Receita de Juros e Rendimentos   
3  3.01.02                              Receita de Dividendos   
4  3.01.03  Resultado de Operações de Câmbio e Variação Ca...   

                         2                        3  
0  01/01/2019 a 31/03/2019  01/01/2018 a 31/03/2018  
1               17.010.000               16.856.000  
2                6.142.000                5.973.000  
3                      NaN                      NaN  
4                  303.000                 -145.000  
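
In the output above the numeric cells come back as strings such as 17.010.000, where the dot is a thousands separator. Below is a minimal sketch of one way to turn them into integers. The helper name parse_br_number is hypothetical (it is not part of pandas or Selenium), and it assumes the cells hold whole numbers with no decimal comma.

```python
import math


def parse_br_number(value):
    """Convert a string like '17.010.000' to an int.

    Hypothetical helper. Assumes '.' is only ever a thousands separator
    and that the cell has no decimal part. Missing cells (None or NaN,
    which is how pandas reports them) are returned as None.
    """
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return None
    # Drop the thousands separators, then parse what is left.
    return int(str(value).strip().replace(".", ""))


# Values copied from columns 2 and 3 of the output above, header row excluded.
raw = ["17.010.000", "6.142.000", float("nan"), "303.000", "-145.000"]
print([parse_br_number(v) for v in raw])
```

With the DataFrame from the code above, something like dfs[0][2].iloc[1:].map(parse_br_number) should apply it to one column while skipping the header row. pd.read_html also accepts header, thousands and decimal arguments, which may handle this at parse time instead.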

Upvotes: 2
