Uma Prajapati

Reputation: 29

How to scrape all results from a list instead of being limited to only 20?

I'm using BeautifulSoup to scrape some data from cars24.com. The resulting list, however, contains details for only 20 cars. That's odd, since the page contains many more cars (I checked by saving it). What am I doing wrong, and how can I scrape the whole page?

This is my code:

from bs4 import BeautifulSoup as bs
import requests

link = 'https://www.cars24.com/buy-used-car?sort=P&storeCityId=2&pinId=110001'
page = requests.get(link)
soup = bs(page.content, 'html.parser')
car_name = soup.find_all('h2', class_='_3FpCg')
cust_name = [tag.get_text() for tag in car_name]
cust_name

Is there a workaround for this? Appreciate the help.

Upvotes: 0

Views: 188

Answers (1)

baduker

Reputation: 20052

Use the site's API endpoint instead of parsing the HTML.

For example:

import requests

# The "page" query parameter selects which page of results is returned.
url = "https://api-sell24.cars24.team/buy-used-car?sort=P&serveWarrantyCount=true&gaId=&page=1&storeCityId=2&pinId=110001"
response = requests.get(url, timeout=30)
response.raise_for_status()  # fail early on an HTTP error
cars = response.json()['data']['content']
base = "https://www.cars24.com/buy-used-"

for car in cars:
    # Build URL slugs: lowercase, with whitespace collapsed into hyphens.
    car_name = "-".join(car['carName'].lower().split())
    car_city = "-".join(car['city'].lower().split())
    offer = f"{base}{car_name}-{car['year']}-cars-{car_city}-{car['carId']}"
    print(f"{car['carName']} - {car['year']} - {car['price']}")
    print(offer)

Output:

Maruti Swift Dzire - 2010 - 256299
https://www.cars24.com/buy-used-maruti-swift-dzire-2010-cars-new-delhi-10084891724
Hyundai Grand i10 - 2018 - 526599
https://www.cars24.com/buy-used-hyundai-grand-i10-2018-cars-noida-10572294761
Datsun Redi Go - 2018 - 234499
https://www.cars24.com/buy-used-datsun-redi-go-2018-cars-gurgaon-11073694705
Maruti Swift - 2020 - 566499
https://www.cars24.com/buy-used-maruti-swift-2020-cars-faridabad-11041770770
Hyundai i10 - 2009 - 170699
https://www.cars24.com/buy-used-hyundai-i10-2009-cars-rohtak-1007315463
Maruti Swift - 2020 - 577399
https://www.cars24.com/buy-used-maruti-swift-2020-cars-new-delhi-10065678773
Hyundai Grand i10 - 2018 - 508799
https://www.cars24.com/buy-used-hyundai-grand-i10-2018-cars-ghaziabad-11261195767
Maruti Swift - 2020 - 587599
https://www.cars24.com/buy-used-maruti-swift-2020-cars-new-delhi-10016194709
Maruti Swift - 2020 - 524099
https://www.cars24.com/buy-used-maruti-swift-2020-cars-new-delhi-10010390743
Hyundai AURA - 2021 - 675099
https://www.cars24.com/buy-used-hyundai-aura-2021-cars-faridabad-11095494760
Maruti Swift - 2019 - 541899
https://www.cars24.com/buy-used-maruti-swift-2019-cars-new-delhi-10016570794
Hyundai Grand i10 - 2019 - 490449
https://www.cars24.com/buy-used-hyundai-grand-i10-2019-cars-noida-10532691707
Hyundai Santro Xing - 2013 - 281999
https://www.cars24.com/buy-used-hyundai-santro-xing-2013-cars-gurgaon-10168291760
Hyundai Santro Xing - 2014 - 272099
https://www.cars24.com/buy-used-hyundai-santro-xing-2014-cars-gurgaon-10121974770
Mercedes Benz C Class - 2014 - 1854499
https://www.cars24.com/buy-used-mercedes-benz-c-class-2014-cars-new-delhi-1050064264
KIA CARENS - 2022 - 1608099
https://www.cars24.com/buy-used-kia-carens-2022-cars-gurgaon-10160777793
Tata ALTROZ - 2021 - 711599
https://www.cars24.com/buy-used-tata-altroz-2021-cars-new-delhi-10083196703
Maruti New  Wagon-R - 2020 - 508899
https://www.cars24.com/buy-used-maruti-new-wagon-r-2020-cars-new-delhi-10084875775
Hyundai Grand i10 - 2018 - 509099
https://www.cars24.com/buy-used-hyundai-grand-i10-2018-cars-new-delhi-10011277773
Maruti Wagon R 1.0 - 2011 - 282499
https://www.cars24.com/buy-used-maruti-wagon-r-1.0-2011-cars-new-delhi-10080499706
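
The slug logic in the loop above can be pulled into a small helper, which makes it easy to check against the sample output. This is only a sketch: slugify and build_offer_url are hypothetical names, the URL pattern is taken from the code above rather than from any documented scheme, and the city spelling "New Delhi" in the sample record is assumed from the slug in the printed URL.

```python
BASE = "https://www.cars24.com/buy-used-"


def slugify(text):
    """Lowercase the text and collapse each run of whitespace into one hyphen."""
    return "-".join(str(text).lower().split())


def build_offer_url(car):
    """Build a listing URL from one API record (keys as used in the answer)."""
    return (
        f"{BASE}{slugify(car['carName'])}-{car['year']}"
        f"-cars-{slugify(car['city'])}-{car['carId']}"
    )


# Sample record based on the output above; the city value is an assumption.
sample = {
    "carName": "Maruti New  Wagon-R",  # note the double space in the name
    "year": 2020,
    "city": "New Delhi",
    "carId": 10084875775,
}
print(build_offer_url(sample))
```

Because str.split() with no arguments splits on any run of whitespace, the double space in "Maruti New  Wagon-R" does not produce a double hyphen in the URL.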

Note: You can paginate the API by incrementing the value of the page parameter in the URL.

For example:

import requests
import pandas as pd

base = "https://www.cars24.com/buy-used-"

table = []
# Reuse one session so the connection stays open across requests.
with requests.Session() as s:
    for page in range(1, 11):  # pages 1 through 10
        print(f"Getting page {page}...")
        url = f"https://api-sell24.cars24.team/buy-used-car?sort=P&serveWarrantyCount=true&gaId=&page={page}&storeCityId=2&pinId=110001"
        response = s.get(url, timeout=30)
        response.raise_for_status()  # fail early on an HTTP error
        cars = response.json()['data']['content']
        for car in cars:
            car_name = "-".join(car['carName'].lower().split())
            car_city = "-".join(car['city'].lower().split())
            offer_url = f"{base}{car_name}-{car['year']}-cars-{car_city}-{car['carId']}"
            table.append([car['carName'], car['year'], car['price'], offer_url])

# One row per car, written without the DataFrame index column.
df = pd.DataFrame(table, columns=['Car Name', 'Year', 'Price', 'Offer URL'])
df.to_csv('cars.csv', index=False)

Output: a .csv file (cars.csv) with the columns Car Name, Year, Price, and Offer URL.
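
The loop above stops at a fixed ten pages. If the total number of pages is unknown, one option is to keep requesting pages until one comes back empty. The sketch below assumes the endpoint returns an empty content list past the last page, which is not confirmed here; fetch_page and iter_all_cars are hypothetical names, and max_pages is only a safety cap.

```python
API_URL = "https://api-sell24.cars24.team/buy-used-car"


def fetch_page(session, page):
    """Fetch one page of listings with a requests.Session-like object.

    The query parameters mirror the URL used in the answer.
    """
    params = {
        "sort": "P",
        "serveWarrantyCount": "true",
        "gaId": "",
        "page": page,
        "storeCityId": 2,
        "pinId": 110001,
    }
    response = session.get(API_URL, params=params, timeout=30)
    response.raise_for_status()
    return response.json()["data"]["content"]


def iter_all_cars(fetch, max_pages=1000):
    """Yield cars page by page until a page is empty (assumed end marker).

    fetch is any callable that maps a page number to a list of car records.
    """
    for page in range(1, max_pages + 1):
        cars = fetch(page)
        if not cars:
            break
        yield from cars
```

Inside a with requests.Session() as s: block, a call such as list(iter_all_cars(lambda page: fetch_page(s, page))) would collect every record. Passing the fetch function in as an argument keeps the stopping logic separate from the network call, so it can be tried out with canned data first.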

Upvotes: 1
