Reputation: 549
I would like to retrieve information from multiple webpages by clicking through them (see images 1 & 2). The problem is that a) there is no next button, and b) even though the page link contains a page number, manually changing that number does not load the next page. This makes the task tricky.
Can anyone help on how to solve this?
here is what the structure of the link looks like (no functioning page) https://sample.io/address/ID#pageSize=100
UPDATE: Got it to work with the help of Robbie W. The code I am using is:
import time
from bs4 import BeautifulSoup as soup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument('window-size=1200,800')
browser = webdriver.Chrome(options=options)
browser.get('URL')

page_soup_1 = soup(browser.page_source, "lxml")
items_1 = page_soup_1.find_all("li", {"class": "page-item"})

# Collect the page numbers shown in the pagination bar
page_nums = []
for item in items_1:
    text = item.get_text(strip=True)
    if text.isdigit():
        page_nums.append(int(text))

Max_pagenum_1 = max(page_nums)

count = 1
while count < Max_pagenum_1:
    # Click the <a> inside the <li> that follows the currently active page item
    link = browser.find_element_by_xpath(
        '//li[contains(@class, "page-item") and contains(@class, "active")]'
        '/following-sibling::li/a')
    link.click()
    count = count + 1
    time.sleep(3)
    print(count)
Upvotes: 1
Views: 646
Reputation: 3428
This may need slight amending as you reach the final few pages, but I would suggest using XPath to find the li next to the currently selected li, and then click the a tag within it.
//li[contains(@class, "page-item") and contains(@class,"active")]/following-sibling::li/a
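To see how this XPath behaves, here is a minimal sketch that runs it against a static pagination snippet with lxml (no browser needed). The HTML structure and the "active" class are assumptions modeled on typical Bootstrap-style pagination; on the real site the markup may differ.

```python
from lxml import html

# Hypothetical pagination markup: page 2 is the currently active item
PAGINATION = """
<ul class="pagination">
  <li class="page-item"><a href="#1">1</a></li>
  <li class="page-item active"><a href="#2">2</a></li>
  <li class="page-item"><a href="#3">3</a></li>
</ul>
"""

tree = html.fromstring(PAGINATION)
# Select the <a> inside the first <li> following the active page item
next_link = tree.xpath(
    '//li[contains(@class, "page-item") and contains(@class, "active")]'
    '/following-sibling::li/a')[0]
print(next_link.text_content())  # prints "3"
```

In Selenium the same expression passed to find_element_by_xpath returns the next page's link, which you can then .click().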
Upvotes: 1