abhi krishnan

Reputation: 836

bs4 getting only first 15 values

Hi, I am new to bs4. I need to get all the products from a site. I tried to get the values, but I only get the first 15 of the 100 products on the page:

from bs4 import BeautifulSoup
import requests
base_url = 'http://www.elkay.com/sinks/undermount#q=|100|0|1|'
response = requests.get(base_url)
soup = BeautifulSoup(response.content.decode('utf-8'), "html.parser")

# Collect every product container in the returned HTML
is_row = soup.find_all('div', attrs={'class': 'product result_detail'})
print(is_row)

Can anyone help me?

Here is_row is a ResultSet of length 15, but there are actually 100 products.

Any help would be appreciated. Thanks.

Upvotes: 1

Views: 127

Answers (1)

farkas

Reputation: 307

If you load that URL on a not-that-fast connection (like mine :D), you can see that the page initially loads only 15 items. After it has fully loaded, it sends another request to load the rest of the items. This is why your code only gets the first 15 items: it only receives the response to the first request.
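A related detail, shown as a small sketch using only the standard library: the part of the URL after `#` is a fragment, and HTTP clients do not send fragments to the server. Reading the `100` in that fragment as the page size picked up by the page's JavaScript is an assumption on my part, not something confirmed by the site.

```python
from urllib.parse import urldefrag

base_url = 'http://www.elkay.com/sinks/undermount#q=|100|0|1|'

# Split the URL into the part that goes over the wire and the fragment,
# which stays on the client side.
url_sent, fragment = urldefrag(base_url)

print(url_sent)   # what the server actually receives
print(fragment)   # never leaves the client
```

So the first request is the same whether or not the `#q=...` part is there, which fits with the page falling back to its default of 15 items.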

Using Chrome Developer Tools (press F12) you can easily find the right request to make:

  1. Open the site
  2. Open developer tools (F12)
  3. Click on the Network tab
  4. Now select Results per page: 100
  5. You should see a new request called CategoryNavigationResultsView
  6. Copy the curl command (Copy as cURL (bash))


  7. Use this very handy site to convert the curl request into Python requests code

I won't copy the full request but it has a data param:

data = {
    'contentBeginIndex': '0^',
    'productBeginIndex': '0^',
    'showPageSize': '100^',
}

Sending the request like this should get you all 100 items. You can also get the following pages by changing the ...BeginIndex.
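A minimal sketch of how this could look, under some assumptions: the request is a POST (the converted code uses a data param), the response contains the same `product result_detail` divs as the page, and the headers and cookies from the copied curl command are not strictly needed. `build_payload` and `fetch_products` are hypothetical helper names, and the `url` argument has to be taken from the request you copied in developer tools.

```python
def build_payload(begin_index=0, page_size=100):
    """Form fields for one page of results, mirroring the data param above."""
    # The trailing '^' is kept exactly as it appears in the captured request;
    # whether the server needs it is an assumption.
    return {
        'contentBeginIndex': f'{begin_index}^',
        'productBeginIndex': f'{begin_index}^',
        'showPageSize': f'{page_size}^',
    }


def fetch_products(url, begin_index=0, page_size=100):
    """POST the payload to `url` and return the product divs from the response."""
    # Imported here so build_payload can be used without third-party packages.
    import requests
    from bs4 import BeautifulSoup

    response = requests.post(
        url, data=build_payload(begin_index, page_size), timeout=30
    )
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')
    return soup.find_all('div', attrs={'class': 'product result_detail'})
```

Calling `fetch_products(url)` with the CategoryNavigationResultsView URL from the copied request would then return the first page, and `fetch_products(url, begin_index=100)` would be the obvious guess for the second page, assuming the index advances by the page size. If the server rejects the request, add the headers from the copied curl command.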

Upvotes: 2
