Reputation: 1
I just started learning Python. I'm trying to scrape all the phone numbers from a paginated website, but my code doesn't move on to the next page and just keeps looping over the same page. Any advice?
from bs4 import BeautifulSoup
import requests

for i in range(5000):
    url = "http://www.mobil123.com/mobil?type=used&page_number=1".format(i)
    r = requests.get(url)
    soup = BeautifulSoup(r.content)
    for record in soup.findAll('div', {"class": "card-contact-wrap"}):
        for data in soup.findAll('div', {"data-get-content": "#whatsapp"}):
            print(record.find('li').text)
            print(data.text)
Upvotes: 0
Views: 2579
Reputation: 180411
As already pointed out, you are missing the actual format placeholder. If you want all the pages, you can scrape the number of pages from the initial page and loop over that range instead of hard-coding the page count. The last page number is in the second-to-last li of the pagination list:
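import requests
from bs4 import BeautifulSoup


def get_pages(url):
    # Fetch and yield the first page.
    soup = BeautifulSoup(requests.get(url).content, "lxml")
    yield soup
    # Add the page_number query parameter as a format placeholder.
    url += "&page_number={}"
    # The second-to-last li in the pagination list holds the last page number.
    last_page = int(soup.select("#js-listings-pagination li")[-2].text)
    for n in range(2, last_page + 1):
        yield BeautifulSoup(requests.get(url.format(n)).content, "lxml")


start = "http://www.mobil123.com/mobil?type=used"
for soup in get_pages(start):
    print(soup)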
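Building the page URL by appending to a string assumes the base URL already has a query string. As a rough sketch, the same URLs could be built with the standard library's urllib.parse instead. Here page_url is a hypothetical helper name, and page_number is the parameter taken from the question's URL:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def page_url(base, page):
    """Return base with its page_number query parameter set to page."""
    parts = urlsplit(base)
    # Keep the existing query parameters and add or overwrite page_number.
    query = dict(parse_qsl(parts.query))
    query["page_number"] = page
    return urlunsplit(parts._replace(query=urlencode(query)))


start = "http://www.mobil123.com/mobil?type=used"
urls = [page_url(start, n) for n in range(1, 4)]
for u in urls:
    print(u)
```

This only constructs the URLs and makes no requests, so it can be checked without touching the site.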
Upvotes: 1
Reputation: 1563
You left out the format placeholder in the string, so .format(i) has nothing to substitute and every request fetches page 1. Change url = "...." to
url = "http://www.mobil123.com/mobil?type=used&page_number={0}".format(i)
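A quick way to see the difference, as a small sketch that only formats strings and makes no requests (the variable names are just for illustration):

```python
base = "http://www.mobil123.com/mobil?type=used&page_number="

# No placeholder: format() has nothing to fill in, so the string is unchanged.
without_placeholder = (base + "1").format(7)

# With {0}: the first argument is substituted into the URL.
with_placeholder = (base + "{0}").format(7)

print(without_placeholder)
print(with_placeholder)
```

The first URL still ends in page_number=1, while the second ends in page_number=7.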
Upvotes: 1