Reputation: 22440
I've written a Python script with Selenium to handle an infinite-scrolling web page. The problem is that it scrolls a few times and then quits the browser without ever reaching the bottom. I tried an explicit wait as well, but that resulted in even fewer scrolls. How can I keep scrolling until the bottom is reached and there is nothing more to load?
This is my attempt:
import time
from selenium import webdriver
from urllib.parse import urljoin
url = "https://www.instagram.com/explore/tags/travelphotoawards/"
driver = webdriver.Chrome()
driver.get(url)
last_len = len(driver.find_elements_by_css_selector(".v1Nh3 a"))
new_len = last_len
while True:
    last_len = new_len
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(5)
    items = driver.find_elements_by_css_selector(".v1Nh3 a")
    new_len = len(items)
    if last_len == new_len:
        break
driver.quit()
Edit:
If I do it as shown below, I can scroll as many times as I want, but hardcoding the number of scrolls is not a good approach:
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
url = "https://www.instagram.com/explore/tags/travelphotoawards/"
driver = webdriver.Chrome()
driver.get(url)
for scroll in range(1, 10):  # the number of scrolls is fully hardcoded
    item = driver.find_element_by_tag_name("body")
    item.send_keys(Keys.END)
    elems = driver.find_elements_by_css_selector(".v1Nh3 a")
    time.sleep(3)
driver.quit()
I hope there is a way to keep scrolling automatically until the page reaches the bottom.
Upvotes: 0
Views: 410
Reputation: 146520
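A few things here. With infinite scrolling, a single unchanged check does not prove you have reached the bottom; the page may simply be slow to load more items.
Below is an updated script that should work better for you. It disables image loading and only stops after the item count has stayed the same for three checks in a row. Remember that nothing is perfect, so your script needs to adapt to failures.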
import time
from selenium import webdriver
from urllib.parse import urljoin
option = webdriver.ChromeOptions()
chrome_prefs = {}
option.experimental_options["prefs"] = chrome_prefs
chrome_prefs["profile.default_content_settings"] = {"images": 2}
chrome_prefs["profile.managed_default_content_settings"] = {"images": 2}
driver = webdriver.Chrome(chrome_options=option)
url = "https://www.instagram.com/explore/tags/travelphotoawards/"
driver.get(url)
last_len = len(driver.find_elements_by_css_selector(".v1Nh3 a"))
new_len = last_len
consistent = 0
while True:
    last_len = new_len
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(5)
    items = driver.find_elements_by_css_selector(".v1Nh3 a")
    new_len = len(items)
    if last_len == new_len:
        consistent += 1
        if consistent == 3:
            break
    else:
        consistent = 0
driver.quit()
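The stopping rule in the loop above can be pulled into a small helper so it can be tried without a browser. This is only a sketch under assumptions: `scroll_until_stable`, `scroll`, `count_items`, `patience`, `pause`, and `max_rounds` are names made up here for illustration and are not part of Selenium.

```python
import time


def scroll_until_stable(scroll, count_items, patience=3, pause=5.0, max_rounds=1000):
    """Call scroll() until count_items() is unchanged `patience` times in a row.

    scroll and count_items are caller-supplied callables (hypothetical names).
    max_rounds is a safety cap so the loop cannot run forever.
    Returns the last observed item count.
    """
    last = count_items()
    unchanged = 0
    for _ in range(max_rounds):
        scroll()
        time.sleep(pause)  # give the page time to load more items
        current = count_items()
        if current == last:
            unchanged += 1
            if unchanged >= patience:
                break
        else:
            unchanged = 0  # new items arrived, so start counting again
        last = current
    return last
```

With Selenium, `scroll` could be `lambda: driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")` and `count_items` could be `lambda: len(driver.find_elements_by_css_selector(".v1Nh3 a"))`, matching the script above.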
Upvotes: 3
Reputation: 50854
Every time the page scrolls, older images disappear, so you might get the same number of images, or even fewer, after a scroll.
Each image has a unique href, so you can compare the href of the last image with the href of the previous last image:
last_href = driver.find_elements_by_css_selector('.v1Nh3 > a')[-1].get_attribute('href')
new_href = last_href
while True:
    last_href = new_href
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(5)
    new_href = driver.find_elements_by_css_selector('.v1Nh3 > a')[-1].get_attribute('href')
    # Stop when the last image did not change, i.e. nothing new was loaded
    if last_href == new_href:
        break
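Because older images drop out of the page as you scroll, it may also help to record the hrefs as you go rather than reading them only at the end. The following is a rough sketch, not a definitive implementation: `collect_hrefs`, `scroll`, `visible_hrefs`, `patience`, `pause`, and `max_rounds` are hypothetical names, and stopping after several rounds without a new href is an assumption added here rather than part of the loop above.

```python
import time


def collect_hrefs(scroll, visible_hrefs, patience=3, pause=5.0, max_rounds=1000):
    """Accumulate unique hrefs across scrolls, in first-seen order.

    scroll and visible_hrefs are caller-supplied callables (hypothetical names).
    Stops after `patience` consecutive rounds that add no new href.
    """
    seen = {}  # dict keys keep insertion order and ignore duplicates
    for href in visible_hrefs():
        seen.setdefault(href, None)
    stale = 0
    for _ in range(max_rounds):
        scroll()
        time.sleep(pause)  # wait for the next batch to load
        before = len(seen)
        for href in visible_hrefs():
            seen.setdefault(href, None)
        if len(seen) == before:
            stale += 1
            if stale >= patience:
                break
        else:
            stale = 0
    return list(seen)
```

With Selenium, `visible_hrefs` could be `lambda: [a.get_attribute('href') for a in driver.find_elements_by_css_selector('.v1Nh3 > a')]`, using the same selector as above.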
Upvotes: 2