Reputation: 137
When writing data to a csv file with Pandas, I used to use the method below. It still works, but throws this warning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
import requests
import pandas as pd
from bs4 import BeautifulSoup
url = "https://www.breuninger.com/de/damen/luxus/bekleidung-jacken-maentel/"
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.61 Safari/537.36",
}
res = requests.get(url, headers=headers)
soup = BeautifulSoup(res.text,"lxml")
df = pd.DataFrame(columns=["Marke","Name","Preis"])
for item in soup.select(".suchen-produkt a"):
    marke = item.select_one(".suchen-produkt__marke").get_text()
    name = item.select_one(".suchen-produkt__name").get_text()
    preis = item.select_one(".suchen-produkt__preis").get_text()
    df = df.append({'Marke':marke,'Name':name,'Preis':preis}, ignore_index=True)
print(df)
df.to_csv("products.csv", index=False)
How can I use concat in place of append while keeping the same scraping logic intact?
Upvotes: -1
Views: 114
Reputation: 463
This is only a deprecation warning, so the code still runs for now. To suppress it, use:
import warnings
warnings.filterwarnings('ignore')
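Note that filterwarnings('ignore') with no other arguments hides every warning, not just this one. A narrower filter is possible; the following is a minimal sketch, assuming the deprecation is raised as a FutureWarning (which is how pandas reported the append deprecation). legacy_call and run_quietly are hypothetical names used only so the example runs without the deprecated method:

```python
import warnings


def legacy_call():
    # Hypothetical stand-in for a deprecated API such as DataFrame.append
    warnings.warn("legacy_call is deprecated", FutureWarning)
    return "done"


def run_quietly():
    # Record warnings inside this block so we can see which ones get through
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Ignore only FutureWarning; other categories are still reported
        warnings.filterwarnings("ignore", category=FutureWarning)
        result = legacy_call()
        warnings.warn("unrelated problem", UserWarning)
    return result, [w.category for w in caught]


result, seen = run_quietly()
```

Here seen contains only UserWarning: the deprecation message is filtered out while unrelated warnings remain visible. Silencing the warning does not keep the deprecated method available in later pandas versions.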
I still use append, but here is a somewhat more cumbersome way to do the same thing with concat:
parameters = ['a', 'b', 'c', 'd', 'e', 'f']
df = pd.DataFrame(columns=parameters)
new_row = pd.DataFrame([1,2,3,4,5,6], columns=['row1'], index=parameters).T
df = pd.concat((df, new_row))
Upvotes: -1
Reputation: 3706
Here's an example:
dfs = []
for item in soup.select(".suchen-produkt a"):
    marke = item.select_one(".suchen-produkt__marke").get_text()
    name = item.select_one(".suchen-produkt__name").get_text()
    preis = item.select_one(".suchen-produkt__preis").get_text()
    dfs.append(pd.DataFrame([{'Marke': marke, 'Name': name, 'Preis': preis}]))
final = pd.concat(dfs).reset_index(drop=True)
print(final)
Or you can append each row as a dict to a plain list and convert it to a DataFrame once at the end:
data = []
for item in soup.select(".suchen-produkt a"):
    marke = item.select_one(".suchen-produkt__marke").get_text()
    name = item.select_one(".suchen-produkt__name").get_text()
    preis = item.select_one(".suchen-produkt__preis").get_text()
    data.append({'Marke': marke, 'Name': name, 'Preis': preis})
final = pd.DataFrame(data)
print(final)
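To tie this back to the CSV step in the question, here is a minimal sketch that runs without the network. The rows list holds made-up sample values standing in for what the scraping loop would collect, and the CSV is written to an in-memory buffer rather than to products.csv; both are assumptions for illustration only:

```python
import io

import pandas as pd

# Hypothetical rows standing in for the values scraped in the loop
rows = [
    {"Marke": "Brand A", "Name": "Jacket", "Preis": "100 EUR"},
    {"Marke": "Brand B", "Name": "Coat", "Preis": "250 EUR"},
]

# Variant 1: build the frame once from the list of dicts
from_list = pd.DataFrame(rows, columns=["Marke", "Name", "Preis"])

# Variant 2: one single-row frame per item, concatenated once at the end
from_concat = pd.concat([pd.DataFrame([r]) for r in rows], ignore_index=True)

# Write to an in-memory buffer; pass a path such as "products.csv" instead
buffer = io.StringIO()
from_list.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Both variants should give the same frame. The list-of-dicts version is usually the simpler choice, since it creates a single DataFrame instead of one per row.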
Upvotes: 2