Reputation: 39
I'm trying to extract a div tag by class to find all the available listings on the website. Currently there are 37 listings, but my code is returning an empty list. What am I doing wrong here?
import requests
from bs4 import BeautifulSoup
url = 'https://www.premiervolvocarsoverlandpark.com/used-volvo/overland-park-ks.htm'
response = requests.get(url)
soup = BeautifulSoup(response.content, "html.parser")
# store the result under its own name so the requests module is not shadowed
listings = soup.find_all('div', {'class': 'vehicle-card-details-container'})
listings
Upvotes: 0
Views: 824
Reputation: 195438
The data you see is loaded via JavaScript, so BeautifulSoup doesn't see it in the HTML that requests downloads. You can replicate the underlying request with the requests module. For example:
import json
import requests

url = "https://www.premiervolvocarsoverlandpark.com/apis/widget/SITEBUILDER_OVERLAND_PARK_KS_1:inventory-data-bus1/getInventory?start={}"

for page in range(0, 4):
    u = url.format(10 * page)
    data = requests.get(u).json()

    # uncomment this to print all data:
    # print(json.dumps(data, indent=4))

    for i in data["inventory"]:
        print("".join(i["title"]), i["pricing"]["retailPrice"])
Prints:
2019 Volvo S60 T5 Momentum Sedan $32,000
2019 Volvo S60 T5 Momentum Sedan $37,000
2019 Volvo S60 T6 Inscription Sedan $42,490
2020 Volvo S60 T6 Momentum Sedan $39,000
2019 Volvo S60 T6 Momentum Sedan $37,000
2018 Volvo S90 T5 Momentum Sedan $36,623
2022 Volvo V90 Cross Country B6 Wagon $62,205
2022 Volvo V90 Cross Country B6 Wagon $58,155
2021 Volvo XC40 R-Design SUV $45,000
2019 Volvo XC40 R-Design SUV $40,000
2017 Volvo XC60 T5 Dynamic SUV $26,500
2020 Volvo XC60 T5 Inscription SUV $47,000
2019 Volvo XC60 T5 Inscription SUV $45,500
2021 Volvo XC60 T5 Momentum SUV $45,000
2019 Volvo XC60 T5 Momentum SUV $43,591
2019 Volvo XC60 T5 Momentum SUV $43,500
2015 Volvo XC60 T6 SUV $0
2021 Volvo XC60 T6 Inscription SUV $56,200
2019 Volvo XC60 T6 Inscription SUV $46,500
2021 Volvo XC60 T6 Momentum SUV $45,000
2021 Volvo XC60 T6 Momentum SUV $48,000
2021 Volvo XC60 T6 Momentum SUV $47,688
2019 Volvo XC60 T6 R-Design SUV $52,000
2021 Volvo XC90 T5 Momentum SUV $59,275
2019 Volvo XC90 T5 R-Design SUV $55,482
2019 Volvo XC90 T6 Inscription SUV $58,688
2021 Volvo XC90 T6 Momentum SUV $63,295
2021 Volvo XC90 T6 Momentum SUV $56,000
2021 Volvo XC90 T6 Momentum SUV $57,000
2020 Volvo XC90 T6 Momentum SUV $58,000
2019 Volvo XC90 T6 Momentum SUV $53,104
2019 Volvo XC90 T6 Momentum SUV $51,359
2019 Volvo XC90 T6 Momentum SUV $52,000
2019 Volvo XC90 T6 Momentum SUV $55,000
2019 Volvo XC90 T6 R-Design SUV $60,862
2019 Volvo XC90 T6 R-Design SUV $54,000
2019 Volvo XC90 Hybrid T8 Inscription SUV $66,300
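The loop above hardcodes four pages, which matches the 37 listings at the time of writing. If the count changes, one option is to keep requesting pages until one comes back empty. The following is only a sketch: iter_inventory, fetch_page, page_size and max_pages are hypothetical names, and it assumes (not verified against the site) that the endpoint returns an empty "inventory" list once start is past the last listing.

```python
from typing import Callable, Iterator


def iter_inventory(
    fetch_page: Callable[[int], dict],
    page_size: int = 10,   # assumption: same step of 10 used in the loop above
    max_pages: int = 100,  # safety cap in case the endpoint never returns an empty page
) -> Iterator[dict]:
    """Yield listings page by page until a page has no "inventory" items."""
    start = 0
    for _ in range(max_pages):
        data = fetch_page(start)
        items = data.get("inventory", [])
        if not items:
            break
        yield from items
        start += page_size
```

With the url from the answer, fetch_page could be something like `lambda start: requests.get(url.format(start)).json()`. Passing the fetcher in as an argument keeps the pagination logic separate from the network call, so it can be tried against canned data first.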
Upvotes: 1
Reputation: 4551
This is a fundamental misunderstanding of how web scraping works.
The page ../overland-park-ks.htm does not contain any DOM elements with class="vehicle-card-details-container". Instead, the page loads JavaScript that eventually fetches a JSON resource called getInventory, which contains the data you're probably interested in.
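A quick way to confirm this kind of problem is to count how often the class appears in the HTML the server actually sends, before any JavaScript runs. The sketch below uses only the standard library; ClassCounter and count_class are hypothetical helper names, and the html argument is assumed to be the raw response text (for example response.text from requests).

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute includes a given class name."""

    def __init__(self, class_name: str) -> None:
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html: str, class_name: str) -> int:
    """Return how many elements in the given HTML carry class_name."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    parser.close()
    return parser.count
```

If count_class(response.text, "vehicle-card-details-container") is 0 while the browser's inspector shows the elements, the content is being added after the initial page load.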
To scrape a page like this, you have to understand how the browser makes several requests to build what you see, or use something like Selenium, which drives a real browser and lets the JavaScript run before you read the page.
Upvotes: 0