Reputation: 13
I want to get the tables from this link: https://radarintermediacoes.com.br/compra-e-venda-de-negocios/estacionamento-no-centro-17/
I am trying to get the information through the following code:
import pandas as pd
import requests
url = "https://radarintermediacoes.com.br/compra-e-venda-de-negocios/estacionamento-no-centro-17/"
header = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest"
}
r = requests.get(url, headers=header)
data = pd.read_html(r.text)
data
With this code, I manage to get only one of the tables I want, the table "DESPESAS FIXAS/VARIÁVEIS", but I also want the table "DETALHES DO NEGÓCIO".
I hope I can get some help or suggestions, thanks!
Upvotes: 1
Views: 881
Reputation: 195438
The data you're looking for isn't inside a <table> tag, so pandas.read_html doesn't see it (it only parses <table> elements). You can parse that part of the page with BeautifulSoup instead, for example:
import requests
import pandas as pd
from bs4 import BeautifulSoup
url = "https://radarintermediacoes.com.br/compra-e-venda-de-negocios/estacionamento-no-centro-17/"
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
all_data = []
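# Each <li> in the panel right after the "Detalhes do Negócio" heading looks like "<strong>LABEL:</strong> value".
# Recent soupsieve releases spell the selector :-soup-contains(); older ones only know the deprecated :contains().
for li in soup.select('div.panel-heading:-soup-contains("Detalhes do Negócio") + div li'):
    # The label is the <strong> text; the value is the text node that follows it.
    a, v = li.strong.text, li.strong.find_next_sibling(string=True)
    all_data.append({'Attribute': a.strip(':'), 'Value': v})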
df = pd.DataFrame(all_data)
print(df)
Prints:
                  Attribute                                 Value
0           TIPO DE NEGÓCIO                       Estacionamentos
1                    REGIÃO                                Centro
2        FATURAMENTO MENSAL                          R$ 13.500,00
3             LUCRO LÍQUIDO                           R$ 3.000,00
4    NÚMERO DE FUNCIONÁRIOS                                     1
5         TEMPO DE CONTRATO                                3 anos
6  HORÁRIO DE FUNCIONAMENTO     Segunda a sexta ds 07:00 as 19:00
7    CONDIÇÕES DE PAGAMENTO  50% de entrada e restante em 30 dias
8                     PREÇO                           R$50.000,00
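If you want to try the idea without hitting the site, here is a minimal offline sketch. The SAMPLE_HTML markup, the "Aluguel" row and the parse_detail_list helper are made up for illustration and only imitate the structure described above (a heading div followed by a div of <li> items, plus one real <table>); the live page may differ. It walks the tree with plain find_all calls instead of a CSS selector, so it does not depend on the soupsieve version.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical markup that imitates the page layout: one panel whose rows
# are <li> items (not table cells) and one genuine <table>.
SAMPLE_HTML = """
<div class="panel">
  <div class="panel-heading">Detalhes do Negócio</div>
  <div class="panel-body">
    <ul>
      <li><strong>REGIÃO:</strong> Centro</li>
      <li><strong>PREÇO:</strong> R$50.000,00</li>
    </ul>
  </div>
</div>
<table>
  <tr><th>Despesa</th><th>Valor</th></tr>
  <tr><td>Aluguel</td><td>R$ 1.000,00</td></tr>
</table>
"""


def parse_detail_list(html, heading):
    """Collect '<strong>LABEL:</strong> value' list items under a panel heading."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for panel_heading in soup.find_all("div", class_="panel-heading"):
        # Case-insensitive match, since the page may render the title in upper case.
        if heading.lower() not in panel_heading.get_text().lower():
            continue
        body = panel_heading.find_next_sibling("div")
        if body is None:
            continue
        for li in body.find_all("li"):
            label = li.find("strong")
            if label is None:
                continue
            # The value is the text node that directly follows the <strong> label.
            value = label.find_next_sibling(string=True)
            rows.append({
                "Attribute": label.get_text(strip=True).rstrip(":"),
                "Value": (value or "").strip(),
            })
    return pd.DataFrame(rows)


# Only one <table> exists in the sample, which is why a table-only parser
# would miss the details panel.
table_count = len(BeautifulSoup(SAMPLE_HTML, "html.parser").find_all("table"))
details = parse_detail_list(SAMPLE_HTML, "Detalhes do Negócio")
print(table_count)
print(details)
```

For the real page you would pass requests.get(url).text in place of SAMPLE_HTML, and keep using pd.read_html for the table you already get.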
Upvotes: 1