frysauce

Reputation: 65

Can't find an HTML Table using BeautifulSoup in Python

I've been trying to parse an HTML table from the following URL (http://www.bmfbovespa.com.br/pt_br/servicos/market-data/consultas/mercado-de-derivativos/precos-referenciais/taxas-referenciais-bm-fbovespa/) but I can't find it using find_all.

The table has id='tb_principal1'. When I run the following code, I keep getting an empty list.

import requests
from bs4 import BeautifulSoup

url = 'http://www.bmfbovespa.com.br/pt_br/servicos/market-data/consultas/mercado-de-derivativos/precos-referenciais/taxas-referenciais-bm-fbovespa/'

r = requests.get(url)
soup = BeautifulSoup(r.text,'lxml')
soup.find_all(id = 'tb_principal1')

I tried some solutions that I found here, but I still can't find the table. Has anyone experienced something similar? Could it be a problem with the encoding?

I appreciate your help.

Upvotes: 0

Views: 784

Answers (1)

Gasvom

Reputation: 651

After a quick look, the table on the page you referenced is actually loaded through an iframe from a different page: http://www2.bmf.com.br/pages/portal/bmfbovespa/lumis/lum-taxas-referenciais-bmf-ptBR.asp. requests only downloads the HTML of the outer page, not the documents embedded in its iframes, so the table never appears in r.text. If you run the same code against the iframe's URL, you should get the expected result:

import requests
from bs4 import BeautifulSoup

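# URL of the page loaded inside the iframe, not the outer page
url = 'http://www2.bmf.com.br/pages/portal/bmfbovespa/lumis/lum-taxas-referenciais-bmf-ptBR.asp'

r = requests.get(url)
soup = BeautifulSoup(r.text, 'lxml')
# print the matches so they also show up when this runs as a script
print(soup.find_all(id='tb_principal1'))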

Output:

[<table id="tb_principal1">
<thead>
<tr>
...
</table>]

For reference, the easiest way I know to track this down is the "Sources" tab in the Chrome page inspector. If you look a few divs above the table elements in the standard "Inspect element" view, you'll also see a form element whose action references that same page.
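If you'd rather check for iframes from Python than from the inspector, something like the sketch below might help. It only lists the src of every iframe in a chunk of HTML and resolves it against the page URL. The function name iframe_sources, the sample markup, and the example.com URLs are all made up for illustration, and I use the built-in 'html.parser' here only to avoid depending on lxml. Note that it will not see iframes that are injected by JavaScript after the page loads.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def iframe_sources(html, base_url):
    """Return the absolute URL of every <iframe> in html that has a src."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        urljoin(base_url, frame["src"])  # resolve relative src values
        for frame in soup.find_all("iframe", src=True)  # skip iframes without src
    ]


# Hypothetical stand-in for the outer page; the real markup may differ.
sample_html = """
<div>
  <iframe src="/pages/inner-table.asp"></iframe>
  <iframe></iframe>
</div>
"""

sources = iframe_sources(sample_html, "http://example.com/outer/")
print(sources)
```

On a real page you would pass r.text and the page url instead of the sample, then request each returned URL and run find_all(id='tb_principal1') on that response.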

Upvotes: 2
