Reputation: 497
I would like to automatically download text files for ATS Blocks Download section on FINRA website. The problem is while I am able to click on the icon and open the file in the browser, I cannot get the page source after the click. driver.page_source
returns the page source for the ATS Blocks Download section page (the one before the click).
Here is a piece of code I was trying out:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
import time
driver = webdriver.Chrome(ChromeDriverManager().install())
URL = 'https://otctransparency.finra.org/otctransparency/'
driver.get(URL)
# Agree to the general terms
driver.find_element_by_xpath('//*[@class="btn btn-warning"]').click()
#go to ATS Blocks Download section
driver.find_element_by_xpath('//*[@href="/otctransparency/AtsBlocksDownload"]').click()
#wait for the page to fully load
time.sleep(5)
#click on each download icon
for element in driver.find_elements_by_xpath('//*[@src="./assets/icon_download.png"]'):
element.click()
print(driver.page_source)
How to get the page source after every element.click()
?
Upvotes: 0
Views: 847
Reputation: 2061
Be sure not to mix various "waiting" mechanisms, as it can result in unexpected behavior (See this StackOverflow post for "why").
Be careful when using setting an implicit wait time, since once it is set, it is set for the lifetime of the driver instance (source, although it has been said in various places across the web).
If you intend to have your driver wait on multiple pages, you should use WebDriverWait
. As shown in other replies, WebDriverWait(driver, timeout)
accepts a WebDriver
instance as well as an integer which represents the amount of time to wait before throwing an TimeoutException
, in other words it accepts a timeout.
You can create a new WebDriverWait
instance every time you're trying to find an element, without having to create a new WebDriver
instance with a new implicit wait time. Since each element may need to waited on for a differing duration, this is ideal. You could go as far as to create a wrapper function to encapsulate the use of WebDriverWait
:
def PatientlyClick(by, path, driver, timeout):
WebDriverWait(driver,timeout).until(EC.element_to_be_clickable((by, path))).click()
The above snippet of code could be made prettier if you designed a class which encapsulated your WebDriver
instance, but that might be unnecessary for your purposes (see Page Object Model Design Pattern).
Upvotes: 0
Reputation: 33384
To get page_source
of all the pages.
You need to
Induce WebDriverWait
and element_to_be_clickable
()
Induce WebDriverWait
and visibility_of_all_elements_located
()
Induce WebDriverWait
and number_of_windows_to_be
()
Code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
driver = webdriver.Chrome()
URL = 'https://otctransparency.finra.org/otctransparency/'
driver.get(URL)
driver.maximize_window()
# Agree to the general terms
WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,'//*[@class="btn btn-warning"]'))).click()
#go to ATS Blocks Download section
WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,'//a[@href="/otctransparency/AtsBlocksDownload"]'))).click()
#click on each download icon
elements=WebDriverWait(driver,10).until(EC.visibility_of_all_elements_located((By.XPATH,'//img[@src="./assets/icon_download.png"]')))
for link in range(len(elements)):
elements = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, '//img[@src="./assets/icon_download.png"]')))
elements[link].click()
WebDriverWait(driver,10).until(EC.number_of_windows_to_be(2))
windowhandles=driver.window_handles
driver.switch_to.window(windowhandles[-1])
WebDriverWait(driver,10).until(EC.visibility_of_element_located((By.TAG_NAME,"pre")))
print(driver.page_source)
driver.close()
driver.switch_to.window(windowhandles[0])
Upvotes: 1
Reputation: 114
After every time that you click element to download other browser tab is opened, in order to get the page source from the other tab use:
for element in driver.find_elements_by_xpath('//[@src="./assets/icon_download.png"]'):
element.click()
driver.switch_to.window(driver.window_handles[1])
driver.set_page_load_timeout(120)
print(driver.page_source)
driver.switch_to.window(driver.window_handles[0])
driver.set_page_load_timeout(120)
PS. Instead of doing the:
time.sleep(5)
You can do:
driver.set_page_load_timeout(120)
Upvotes: 0