Reputation: 1
When running the following code I get an error:
'str' object has no attribute 'text'
import requests
import pandas as pd
from bs4 import BeautifulSoup
baseURL= 'https://www.olx.pl/nieruchomosci/mieszkania/sprzedaz/pomorskie/'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) Gecko/20100101 Firefox/89.0'}
offer_links = []
for x in range (1,2):
    r = requests.get(f'https://www.olx.pl/nieruchomosci/mieszkania/sprzedaz/pomorskie/?page={x}')
    soup = BeautifulSoup(r.content, 'lxml')
    offer_list= soup.find_all('div', class_='space rel')
    for item in offer_list:
        for link in item.find_all('a', href=True):
            offer_links.append(link['href'])
print(offer_links)
for link in offer_links:
    r= requests.get(link, headers=headers)
    soup=BeautifulSoup(r.content, 'lxml')
    nazwa = soup.find('h1',class_='css-1oarkq2-Text eu5v0x0').text.strip()
    szczegoly = soup.find('p',class_='css-xl6fe0-Text eu5v0x0').text.strip()
    cenazam2=[]
    poziom=[]
    umeblowane=[]
    rynek=[]
    zabudowa=[]
    powi=[]
    for i in range(0,7):
        p=szczegoly[i].text.strip()
        if("Cena" in p):
            cenazam2.append(p)
        elif("Poziom" in p or "SSD" in p):
            poziom.append(p)
        elif("Umeblowane" in p):
            umeblowane.append(p)
        elif("Rynek" in p):
            rynek.append(p)
        elif("zabudowy" in p):
            zabudowa.append(p)
        elif("Powierzchnia" in p):
            powi.append(p)
    oferty = {'Nazwa':nazwa,'Cena':cenazam2,'Poziom':poziom,'Umeblowane':umeblowane,'Rynek':rynek,'Zabudowa':zabudowa,'Powierzchnia':powi}
dataset = pd.DataFrame.from_dict(oferty, orient='index')
dataset = dataset.transpose()
dataset
I need output like the one in the attachment, which was produced for a single test link. The code works for one link, but I would like to automate it so that it collects the data for a selected number of links.
Upvotes: 0
Views: 1340
Reputation: 3116
szczegoly is a string, not a list of elements, because .text.strip() is already applied when you assign it here:
szczegoly = soup.find('p',class_='css-xl6fe0-Text eu5v0x0').text.strip()
Indexing a string returns single characters, which are strings themselves, so you can no longer access PageElement attributes (.text) here:
p=szczegoly[i].text.strip()
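The failure can be reproduced without any scraping. This is only a small illustration; the string value below is made up and merely stands in for whatever soup.find(...).text.strip() returns on the real page.

```python
# Hypothetical stand-in for the result of soup.find(...).text.strip(): a plain str.
szczegoly = "Cena: 500 000 zl"

first = szczegoly[0]          # indexing a str yields a one-character str
print(type(first).__name__)   # str

try:
    first.text                # a str has no .text attribute
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)      # 'str' object has no attribute 'text'
```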
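You have to decide where you want to strip the values. Either keep the elements and strip each one inside the loop (note find_all instead of find, so that the result can be indexed):
szczegoly = soup.find_all('p',class_='css-xl6fe0-Text eu5v0x0')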
...
p=szczegoly[i].text.strip()
or strip every element up front and index the resulting list of strings:
szczegoly = [el.text.strip() for el in soup.find_all('p',class_='css-xl6fe0-Text eu5v0x0')]
...
p=szczegoly[i]
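Both should work as long as an element exists at each index; with range(0,7), a page that has fewer than seven matching elements raises an IndexError.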
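A minimal sketch of the element-based approach, iterating over the matches directly instead of over a fixed range so that short pages do not raise an IndexError. The HTML snippet, the "detail" class name, and the keywords mapping are all made up for illustration; the real OLX markup and class names differ and may change.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for one offer page (not the real OLX markup).
html = """
<ul>
  <li><p class="detail">Cena za m2: 9000 zl</p></li>
  <li><p class="detail">Poziom: 3</p></li>
  <li><p class="detail">Umeblowane: Tak</p></li>
  <li><p class="detail">Rynek: Wtorny</p></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like ResultSet of Tag objects, so each item has .text.
szczegoly = soup.find_all("p", class_="detail")

# Hypothetical mapping from a keyword in the text to the column it belongs to.
keywords = {
    "Cena": "Cena",
    "Poziom": "Poziom",
    "Umeblowane": "Umeblowane",
    "Rynek": "Rynek",
    "zabudowy": "Zabudowa",
    "Powierzchnia": "Powierzchnia",
}

oferta = {}
for element in szczegoly:      # iterate over whatever was found
    p = element.text.strip()
    for keyword, column in keywords.items():
        if keyword in p:
            oferta[column] = p
            break

print(oferta)
```

Building one such dictionary per link and appending it to a list would give pd.DataFrame a list of rows to work with, one row per offer.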
Upvotes: 1