Oberdan O. Fonseca

Reputation: 25

Error: Requests and lxml libraries return empty brackets in web scraping

I have a problem using the Requests and lxml libraries for web scraping in Python.

I need to capture the information highlighted in yellow on this page (http://www.b3.com.br/pt_br/market-data-e-indices/indices/indices-amplos/indice-ibovespa-ibovespa-composicao-da-carteira.htm). However, my XPath query returns an empty list: []

Could someone please help me?

The code is below:

from lxml import html
import requests
 
page = requests.get('http://www.b3.com.br/pt_br/market-data-e-indices/indices/indices-amplos/indice-ibovespa-ibovespa-composicao-da-carteira.htm')
tree = html.fromstring(page.content)
 
cod = tree.xpath('//*[@id="divContainerIframeB3"]/div/div[1]/form/div[2]/div/table/tbody/tr[1]/td[1]')
 
print('The code is : ', cod)

Image of return: [screenshot not available]

Inspect Browser: [screenshot not available]

Upvotes: 1

Views: 120

Answers (1)

Andrej Kesely

Reputation: 195478

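The data is loaded via JavaScript from an external source, so the table is not part of the HTML that Requests downloads, and the XPath query matches nothing. You can use this script to load the JSON data directly: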

import json
import base64
import requests


api_url = "https://sistemaswebb3-listados.b3.com.br/indexProxy/indexCall/GetPortfolioDay/{encoded_string}"

page = 1
index = "IBOV"

s = {
    "language": "pt-br",
    "pageNumber": page,
    "pageSize": 20,
    "index": index,
    "segment": "1",
}

encoded_string = base64.b64encode(str(s).encode("utf-8")).decode("utf-8")

data = requests.get(
    api_url.format(encoded_string=encoded_string),
    verify=False,
).json()

# uncomment this to get all data:
# print(json.dumps(data, indent=4))

for result in data["results"]:
    print(
        "{:<8} {:<15} {:15}".format(
            result["cod"], result["asset"], result["theoricalQty"]
        )
    )

Prints:

ABEV3    AMBEV S/A       4.355.174.839  
ASAI3    ASSAI           157.635.935    
AZUL4    AZUL            327.283.207    
BTOW3    B2W DIGITAL     201.549.295    
B3SA3    B3              1.930.877.944  
BBSE3    BBSEGURIDADE    671.584.841    
BRML3    BR MALLS PAR    843.728.684    
BBDC3    BRADESCO        1.261.986.269  
BBDC4    BRADESCO        4.687.814.597  
BRAP4    BRADESPAR       222.075.664    
BBAS3    BRASIL          1.283.197.221  
BRKM5    BRASKEM         264.640.575    
BRFS3    BRF SA          811.759.800    
BPAC11   BTGP BANCO      263.871.572    
CRFB3    CARREFOUR BR    391.758.726    
CCRO3    CCR SA          1.115.695.556  
CMIG4    CEMIG           969.723.092    
HGTX3    CIA HERING      126.186.408    
CIEL3    CIELO           1.112.196.638  
COGN3    COGNA ON        1.847.994.874  
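
The script above fetches only the first page of 20 rows. As a rough sketch, the request can be wrapped in helpers so that other pages can be requested the same way. The names `build_portfolio_url`, `parse_quantity`, and `fetch_all` are hypothetical, and two things are assumptions rather than documented behaviour: that a page past the end comes back with an empty "results" list, and that the dots in the quantity are thousands separators.

```python
import base64

API_URL = (
    "https://sistemaswebb3-listados.b3.com.br/indexProxy/indexCall/"
    "GetPortfolioDay/{encoded_string}"
)


def build_portfolio_url(index="IBOV", page=1, page_size=20):
    """Return the request URL for one page, encoded as in the script above."""
    params = {
        "language": "pt-br",
        "pageNumber": page,
        "pageSize": page_size,
        "index": index,
        "segment": "1",
    }
    # Same encoding as the original script: str() of the dict, then Base64.
    encoded = base64.b64encode(str(params).encode("utf-8")).decode("utf-8")
    return API_URL.format(encoded_string=encoded)


def parse_quantity(text):
    """Convert a quantity such as '4.355.174.839' to an int.

    Assumption: the dots are thousands separators.
    """
    return int(text.replace(".", ""))


def fetch_all(index="IBOV", page_size=20, max_pages=50):
    """Collect rows page by page until a page comes back empty.

    Assumption: requesting a page past the end yields no "results".
    max_pages is a safety cap in case that assumption does not hold.
    """
    import requests  # imported here so the pure helpers work without it

    rows = []
    for page in range(1, max_pages + 1):
        url = build_portfolio_url(index, page, page_size)
        data = requests.get(url, verify=False, timeout=30).json()
        results = data.get("results") or []
        if not results:
            break
        rows.extend(results)
    return rows
```

Calling `fetch_all("IBOV")` should then return a list of dictionaries with the same keys as in the loop above ("cod", "asset", "theoricalQty"), and `parse_quantity(row["theoricalQty"])` gives a number you can sum or sort on.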

Upvotes: 1
