Vitor

Reputation: 29

requests and BeautifulSoup <tables>

I'm trying to pull a single value from a table on a web page. How could I change this code so it pulls only one value from the table?

Specifically, I'm trying to pull just the value 0.83. How could I do that?

    import requests
    from bs4 import BeautifulSoup
    
    
    url = 'https://www.gov.br/receitafederal/pt-br/assuntos/orientacao-tributaria/pagamentos-e-parcelamentos/taxa-de-juros-selic#Taxa_de_Juros_Selic'
    
    headers = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36"}
    
    
    page = requests.get(url ,headers=headers)
    
    
    #print(page.content)
    #span class = DFlfde SwHCTb
    soup = BeautifulSoup(page.content, "html.parser")
    
    valor_taxa = soup.find_all("table",class_ ="listing" )[0]
    valor_tr = soup.find_all("tr",class_="odd")
    valor_especifico = soup.select('td', class_={'align': 'CENTER'})
    
    
    print(valor_especifico)

The output is:

C:\Users\Francisco\PycharmProjects\INSS\Scripts\python.exe C:/Users/Francisco/PycharmProjects/INSS/web.py
[<td>
<ul>
<li></li>
</ul>
</td>, <td> <strong><a class="anchor-link" href="#Taxa_de_Juros_Selic" target="_self" title="">Taxa de Juros Selic</a></strong></td>, <td>
<ul>
<li></li>
</ul>
</td>, <td><strong><a class="anchor-link" href="#Selicacumulada" target="_self" title=""> </a><a class="anchor-link" href="#Selicmensalmente" target="_self" title="">Taxa de Juros Selic Acumulada Mensalmente</a></strong></td>, <td>
<ul>
<li></li>
</ul>
</td>, <td> <a class="anchor-link" href="#Taxa" target="_self" title=""><strong>Taxa de Juros Selic Incidente sobre as Quotas do Imposto de Renda Pessoa Físic</strong>a</a></td>, <td align="LEFT"><b>Mês/Ano</b></td>, <td align="CENTER"><b>2013</b></td>, <td align="CENTER"><b>2014</b></td>, <td align="CENTER"><b>2015</b></td>, <td align="CENTER"><b>2016</b></td>, <td align="CENTER"><b>2017</b></td>, <td align="CENTER"><b>2018</b></td>, <td align="CENTER"><b>2019</b></td>, <td align="CENTER"><b>2020</b></td>, <td align="CENTER"><b>2021</b></td>, <td align="CENTER"><b>2022</b></td>, <td align="LEFT"><b>Janeiro</b></td>, <td align="CENTER">0,60%</td>, <td align="CENTER">0,85%</td>, <td align="CENTER">0,94%</td>, <td align="CENTER">1,06%</td>, <td align="CENTER">1,09%</td>, <td align="CENTER">0,58%</td>, <td align="CENTER">0,54%</td>, <td align="CENTER">0,38%</td>, <td align="CENTER">0,15%</td>, <td align="CENTER">0,73%</td>, <td align="LEFT"><b>Fevereiro</b></td>, <td align="CENTER">0,49%</td>, <td align="CENTER">0,79%</td>, <td align="CENTER">0,82%</td>, <td align="CENTER">1,00%</td>, <td align="CENTER">0,87%</td>, <td align="CENTER">0,47%</td>, <td align="CENTER">0,49%</td>, <td align="CENTER">0,29%</td>, <td align="CENTER">0,13%</td>, <td align="CENTER">0,76%</td>, <td align="LEFT"><b>Março</b></td>, <td align="CENTER">0,55%</td>, <td align="CENTER">0,77%</td>, <td align="CENTER">1,04%</td>, <td align="CENTER">1,16%</td>, <td align="CENTER">1,05%</td>, <td align="CENTER">0,53%</td>, <td align="CENTER">0,47%</td>, <td align="CENTER">0,34%</td>, <td align="CENTER">0,20%</td>, <td align="CENTER">0,93%</td>, <td align="LEFT"><b>Abril</b></td>, <td align="CENTER">0,61%</td>, <td align="CENTER">0,82%</td>, <td align="CENTER">0,95%</td>, <td align="CENTER">1,06%</td>, <td align="CENTER">0,79%</td>, <td align="CENTER">0,52%</td>, <td align="CENTER">0,52%</td>, <td align="CENTER">0,28%</td>, <td align="CENTER">0,21%</td>, <td align="CENTER">0,83%</td>, <td align="LEFT"><b>Maio</b></td>, <td align="CENTER">0,60%</td>, <td align="CENTER">0,87%</td>, <td align="CENTER">0,99%</td>, <td align="CENTER">1,11%</td>, <td align="CENTER">0,93%</td>, <td align="CENTER">0,52%</td>, <td align="CENTER">0,54%</td>, <td align="CENTER">0,24%</td>, <td align="CENTER">0,27%</td>, <td align="CENTER"></td>, <td align="LEFT"><b>Junho</b></td>, <td align="CENTER">0,61%</td>, <td align="CENTER">0,82%</td>, <td align="CENTER">1,07%</td>, <td align="CENTER">1,16%</td>, <td align="CENTER">0,81%</td>, <td align="CENTER">0,52%</td>, <td align="CENTER">0,47%</td>, <td align="CENTER">0,21%</td>, <td align="CENTER">0,31%</td>, <td align="CENTER"></td>, <td align="LEFT"><b>Julho</b></td>, <td align="CENTER">0,72%</td>, <td align="CENTER">0,95%</td>, <td align="CENTER">1,18%</td>, <td align="CENTER">1,11%</td>, <td align="CENTER">0,80%</td>, <td align="CENTER">0,54%</td>

Process finished with exit code 0

Upvotes: 1

Views: 71

Answers (2)

Adon Bilivit

Reputation: 26976

There are more succinct ways to do this, but breaking it down into individual steps may make it clearer.

Do a GET on the URL and check the HTTP status.

Build 'soup' from the response text.

Iterate over each table, tr, and td, finally printing the text of the lowest-level tds.

    import requests
    from bs4 import BeautifulSoup as BS

    (r := requests.get('https://www.gov.br/receitafederal/pt-br/assuntos/orientacao-tributaria/pagamentos-e-parcelamentos/taxa-de-juros-selic#Taxa_de_Juros_Selic')).raise_for_status()

    soup = BS(r.text, 'lxml')

    for table in soup.find_all('table', {'class': 'listing'}):
        for tr in table.find_all('tr', {'class': 'odd'}):
            for td in tr.find_all('td', {'align': 'CENTER'}):
                print(td.text)
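
If only one cell is wanted (the question asks for 0.83, which appears as 0,83% in the Abril/2022 cell of the output above), the same soup can be narrowed down by row label and column year. This is only a sketch and assumes the page keeps its current layout: a "listing" table whose first row contains "Mês/Ano" followed by the years, and month rows whose first cell holds the month name.

    import requests
    from bs4 import BeautifulSoup as BS

    url = ('https://www.gov.br/receitafederal/pt-br/assuntos/orientacao-tributaria/'
           'pagamentos-e-parcelamentos/taxa-de-juros-selic#Taxa_de_Juros_Selic')
    (r := requests.get(url)).raise_for_status()
    soup = BS(r.text, 'lxml')

    for table in soup.find_all('table', {'class': 'listing'}):
        rows = table.find_all('tr')
        # Header row: "Mês/Ano", 2013, ..., 2022 -> find the column index of 2022
        header = [td.get_text(strip=True) for td in rows[0].find_all('td')]
        if 'Mês/Ano' not in header:
            continue                      # not the month/year table
        col = header.index('2022')
        for tr in rows[1:]:
            cells = [td.get_text(strip=True) for td in tr.find_all('td')]
            if cells and cells[0] == 'Abril':
                print(cells[col])         # -> 0,83%
                break
        break                             # only the first matching table

Looking the year up in the header row instead of hard-coding a cell index keeps the lookup working if a column for a new year is added.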

Upvotes: 1

dkantor

Reputation: 162

Try the selenium package:

    from selenium.webdriver.common.by import By
    from selenium import webdriver
    from webdriver_manager.chrome import ChromeDriverManager

    url = 'https://www.gov.br/receitafederal/pt-br/assuntos/orientacao-tributaria/pagamentos-e-parcelamentos/taxa-de-juros-selic#Taxa_de_Juros_Selic'

    driver = webdriver.Chrome(ChromeDriverManager().install())
    driver.get(url)
    elementsxpath = '...'  # !!!!! define the xpath of the element
    print(driver.find_element(By.XPATH, elementsxpath).text)
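
The XPath still has to be filled in. As a purely hypothetical illustration (not verified against the live page), continuing from the snippet above and assuming the layout shown in the question's output, where the 2022 value is the 11th cell of the row whose first cell reads "Abril":

    # Hypothetical XPath, based only on the layout visible in the question's
    # output: first row whose first cell reads "Abril"; its 11th td is the
    # 2022 column (0,83%).
    elementsxpath = "//tr[td/b='Abril']/td[11]"
    print(driver.find_element(By.XPATH, elementsxpath).text)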

Upvotes: 0
