How to append data to a DataFrame

import requests
from bs4 import BeautifulSoup
import pandas as pd

headers = {'User-Agent': 'Mozilla/5.0'}

response = requests.get('http://smartcatalog.emo-milano.com/it/espositore/a-mannesmann-maschinenfabrik-gmbh', headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')
tables = soup.find_all('table', class_='expo-table general-color')
for table in tables:
    # Each <td> is handled on its own, so labels and values print separately
    for cell in table.find_all('td'):
        text_list = list(cell.stripped_strings)
        print(text_list)

This code works and extracts the correct data, but not in the format I want. I would like the output to look like the listing below, with each label next to its value. Can you help?

Indirizzo   Bliedinghauserstrasse 27
Città        Remscheid
Nazionalità   Germania
Sito web      www.amannesmann.de
Stand         Pad. 3 E14 F11
Telefono      +492191989-0
Fax          +492191989-201
E-mail       [email protected]
Membro di     Cecimo
Social

Upvotes: 0

Answers (3)

Ram

Reputation: 4779

Instead of selecting the <td> elements, select each <tr> and call .stripped_strings on it to get the data row by row. Append each row to a list, then build the DataFrame from that list.

Here is the code:

import requests
from bs4 import BeautifulSoup
import pandas as pd
headers = {'User-Agent': 'Mozilla/5.0'}

# Collect one list of strings per table row
temp = []
response = requests.get('http://smartcatalog.emo-milano.com/it/espositore/a-mannesmann-maschinenfabrik-gmbh', headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')
tables = soup.find_all('table', class_='expo-table general-color')
for table in tables:
    for tr in table.find_all('tr'):
        temp.append(list(tr.stripped_strings))

df = pd.DataFrame(temp)
print(df)

             0                         1
0    Indirizzo  Bliedinghauserstrasse 27
1        Città                 Remscheid
2  Nazionalità                  Germania
3     Sito web        www.amannesmann.de
4        Stand            Pad. 3 E14 F11
5     Telefono              +492191989-0
6          Fax            +492191989-201
7       E-mail       [email protected]
8    Membro di                      None
9       Social                      None
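
If named columns are preferred over 0 and 1, the same row-wise idea can be sketched as below. This is only a sketch under assumptions: SAMPLE_HTML is a hypothetical stand-in for the page markup (so it runs without a network request), and table_to_frame and the column names "Field" and "Value" are made up for illustration.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical stand-in for the real page, so the sketch runs offline.
SAMPLE_HTML = """
<table class="expo-table general-color">
  <tr><td>Indirizzo</td><td>Bliedinghauserstrasse 27</td></tr>
  <tr><td>Città</td><td>Remscheid</td></tr>
  <tr><td>Membro di</td><td></td></tr>
</table>
"""


def table_to_frame(html):
    """Turn each <tr> of the first matching table into a (Field, Value) row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="expo-table")
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(" ", strip=True) for td in tr.find_all("td")]
        if not cells:
            continue  # skip rows without any <td>
        label = cells[0]
        # An empty or missing second cell becomes None instead of ""
        value = cells[1] if len(cells) > 1 and cells[1] else None
        rows.append((label, value))
    return pd.DataFrame(rows, columns=["Field", "Value"])


df = table_to_frame(SAMPLE_HTML)
print(df)
```

Reading the cells per <tr> keeps each label paired with its value even when the value cell is empty, which is where a flat list of <td> texts gets out of step.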

Upvotes: 0

RJ Adriaansen

Reputation: 9619

pandas has a built-in HTML table parser, pd.read_html, so you can run:

dfs = pd.read_html('http://smartcatalog.emo-milano.com/it/espositore/a-mannesmann-maschinenfabrik-gmbh')

This returns a list with one DataFrame per table found on the page. The table you want is dfs[0]:

	0	1
0	Indirizzo	Bliedinghauserstrasse 27
1	Città	Remscheid
2	Nazionalità	Germania
3	Sito web	www.amannesmann.de
4	Stand	Pad. 3 E14 F11
5	Telefono	+492191989-0
6	Fax	+492191989-201
7	E-mail	[email protected]
8	Membro di	nan
9	Social	nan
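
To print the frame in the plain label/value layout from the question, without the index column and the 0/1 header, DataFrame.to_string can be used. A minimal sketch, assuming a small frame built by hand from a few of the rows shown above rather than fetched from the site:

```python
import pandas as pd

# A few rows copied from the output above; on the real page this would be
# the first DataFrame in the list returned by pd.read_html.
df = pd.DataFrame([
    ["Indirizzo", "Bliedinghauserstrasse 27"],
    ["Città", "Remscheid"],
    ["Nazionalità", "Germania"],
])

# index=False drops the row numbers, header=False drops the 0/1 column names
text = df.to_string(index=False, header=False)
print(text)
```

The exact column padding is chosen by pandas, so the spacing may not match the listing in the question character for character.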

Upvotes: 1

Bhavya Parikh

Reputation: 3400

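You can use the .get_text() method to extract the text of each row: strip=True trims the surrounding whitespace and separator=" " puts a space between the cell values. Note that table has to be a single element (from soup.find), not the list returned by soup.find_all.

table = soup.find('table', class_='expo-table general-color')
data = table.find_all("tr")
for i in data:
    print(i.get_text(strip=True, separator=" "))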

Output:

Indirizzo Bliedinghauserstrasse 27
Città Remscheid
...

Upvotes: 0
