Reputation: 23
I'm new to BeautifulSoup and I'm trying to scrape the price of each car. The problem is that the div holding the price returns two values: the price and the seller type. I was able to extract the rest of the attributes, but I can't seem to get the price on its own.
My code (loops through each listing div and finds the specific attributes of the car):
import requests, lxml.html, csv
from bs4 import BeautifulSoup
url = requests.get("https://www.carsireland.ie/used-cars/bmw")
content = url.content
pri = lxml.html.fromstring(url.content)
soup = BeautifulSoup(content, 'lxml')
rows = soup.find_all("div", {"class": "listing__details listing__details--desktop"})
# write headers
for row in rows:
    carname = row.find('h2').text.strip()
    carlocation = row.find('div', {"class": "listing__details-location"}).text.strip()
    carmileage = row.find('div', {"class": "listing__details-data-mileage"}).text.strip()
    carcolour = row.find('div', {"class": "listing__details-color"}, 'p').text.strip()
    caryear = row.find('div', {"class": "listing__details-data-year"}, 'p').text.strip()
    carprice = row.find('div', {"class": "listing__details-private-seller"}).find_previous()
    print(carprice)
This is the HTML for one element of rows, the div I used to locate the other attributes:
<div class="listing__details listing__details--desktop">
<div class="listing__details-location">
Meath
</div>
<div class="listing__details-vehicle">
<h2>BMW 316</h2>
<p>316I ES Z3SQ 4DR E90 SALOON N45 1.6</p>
</div>
<div class="listing__details-data">
<div class="listing__details-data-year">
<p>2007</p>
</div>
<div class="listing__details-data-mileage">
309 km
</div>
</div>
<div class="listing__details-pricing">
€900
<div class="listing__details-private-seller">Private</div>
</div>
<div class="listing__details-color">
<span class="" style="background-color: black;"></span>
<p>BLACK</p>
</div>
</div>
Upvotes: 0
Views: 48
Reputation: 1
Use the class "listing__details-pricing" instead of "listing__details-private-seller" to get carprice. The price is text inside the pricing div, and the private-seller div is nested after it.
Modified code:
for row in rows:
    carname = row.find('h2').text.strip()
    carlocation = row.find('div', {"class": "listing__details-location"}).text.strip()
    carmileage = row.find('div', {"class": "listing__details-data-mileage"}).text.strip()
    carcolour = row.find('div', {"class": "listing__details-color"}, 'p').text.strip()
    caryear = row.find('div', {"class": "listing__details-data-year"}, 'p').text.strip()
    # The div's text is the price followed by the seller type; split() breaks on
    # any whitespace (spaces or newlines), so the first token is the price.
    carprice = row.find('div', {"class": "listing__details-pricing"}).text.split()[0]
    print(carprice)
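As an alternative to splitting the combined text, here is a minimal sketch that reads only the text node sitting directly inside the pricing div. It runs against a trimmed copy of the HTML from the question (the indentation in the snippet is an assumption) and uses the built-in "html.parser" so that it does not depend on lxml:

```python
from bs4 import BeautifulSoup

# Trimmed copy of the pricing markup shown in the question.
html = """
<div class="listing__details listing__details--desktop">
  <div class="listing__details-pricing">
    €900
    <div class="listing__details-private-seller">Private</div>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
pricing = soup.find("div", {"class": "listing__details-pricing"})

# recursive=False limits the search to direct children, so the text
# inside the nested private-seller div is never considered.
carprice = pricing.find(string=True, recursive=False).strip()
print(carprice)
```

This does not rely on how the whitespace between the price and "Private" happens to be laid out on the page.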
Upvotes: 0
Reputation: 1122
Use the re module to extract just the number, or the .strings attribute, which returns a generator:
import re
# ...
for row in rows:
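    # Strip everything except digits and the decimal point, then convert.
    price = float(re.sub(r"[^0-9.]", "", row.find('div', {"class": "listing__details-pricing"}).text))
    print(price)  # prints 900.0
    # or take the first string in the div, which is the price text
    price = next(row.find('div', {"class": "listing__details-pricing"}).strings).strip()
    print(price)  # prints €900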
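One caveat with the re.sub approach: float("") raises ValueError when the text contains no digits at all. The sketch below is a slightly more defensive variant. The helper name parse_price and the sample strings are hypothetical, and it assumes prices use a comma as the thousands separator and a period as the decimal point:

```python
import re

def parse_price(text):
    """Return the first number found in text as a float, or None if there is none."""
    # Match a run of digits that may contain commas, plus an optional decimal part.
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    # Assumption: commas are thousands separators, so drop them before converting.
    return float(match.group().replace(",", ""))

print(parse_price("\n€900\nPrivate\n"))  # 900.0
print(parse_price("€12,500"))            # 12500.0
print(parse_price("POA"))                # None
```

Inside the loop it could be called as parse_price(row.find('div', {"class": "listing__details-pricing"}).text).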
Upvotes: 1