TJ1

Reputation: 8488

Python: using Selenium for multiple URLs without quitting driver

I am trying to use Selenium to scrape a number of URLs. Here is part of the code:

driver = webdriver.Chrome()
url = 'first URL'
driver.execute_script('''window.open("'''+str(url)+'''","_blank");''')
driver.switch_to_window(driver.window_handles[1])
time.sleep(3)
doc1 = html.fromstring(driver.page_source)

url = 'second URL'
driver.execute_script('''window.open("'''+str(url)+'''","_blank");''')
driver.switch_to_window(driver.window_handles[1])
time.sleep(3)
doc2 = html.fromstring(driver.page_source)

But what I see is that doc1 and doc2 are the same. Any idea why this happens?

I guess one way is to call driver.quit() after getting doc1 and then repeat everything for the second URL. But I don't want to quit Chrome. Is this possible?

Upvotes: 1

Views: 1910

Answers (3)

SanV

Reputation: 945

If you place all URLs in a list or tuple (e.g., "myURLs"), you could use the following approach:

from selenium import webdriver
myURLs = ["https://google.com", "https://bing.com", "https://duckduckgo.com"]
driver = [None] * len(myURLs)
# for info on enumerate(), see link below
for i, item in enumerate(myURLs):
    driver[i] = webdriver.Chrome()
    driver[i].get(item)

PEP 279: enumerate()

Upvotes: 0

Ali

Reputation: 1689

Your first driver.execute_script() call runs in a browser that already has its default window open, and it loads the URL in a new window, so you now have two windows. Switching to driver.window_handles[1] selects that second window, which is fine.

After the second driver.execute_script() call there are three windows: the two earlier ones plus the new one. If you switch to driver.window_handles[1] again, you select the same window as before and get the same page source. To avoid this, change the index to 2.

Try the code below:

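import time

from lxml import html  # assumption: html.fromstring is lxml's
from selenium import webdriver

driver = webdriver.Chrome()
url = 'first URL'
driver.execute_script('''window.open("'''+str(url)+'''","_blank");''')
# switch_to.window() replaces switch_to_window(), which was removed in Selenium 4
driver.switch_to.window(driver.window_handles[1])
time.sleep(3)
doc1 = html.fromstring(driver.page_source)

url = 'second URL'
driver.execute_script('''window.open("'''+str(url)+'''","_blank");''')
# three windows are open now, so the newest one is at index 2
driver.switch_to.window(driver.window_handles[2])
time.sleep(3)
doc2 = html.fromstring(driver.page_source)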
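
If you open more than two URLs, hard-coding the index gets fragile. One possible alternative is to compare the window handles before and after window.open() and switch to whichever handle is new. The sketch below is only an illustration: open_in_new_tab and its wait_seconds parameter are made-up names, not part of Selenium, and it assumes the new handle is already listed when execute_script() returns.

```python
import time


def open_in_new_tab(driver, url, wait_seconds=3.0):
    """Open url in a new tab, switch to it, and return its page source.

    Hypothetical helper for illustration; not part of Selenium.
    """
    # Remember which windows existed before opening the new one.
    known_handles = set(driver.window_handles)
    # Pass the URL as a script argument instead of concatenating strings.
    driver.execute_script("window.open(arguments[0], '_blank');", url)
    # The new tab is whichever handle was not there before.
    new_handles = [h for h in driver.window_handles if h not in known_handles]
    if len(new_handles) != 1:
        raise RuntimeError("expected exactly one new window handle")
    driver.switch_to.window(new_handles[0])
    time.sleep(wait_seconds)  # crude fixed wait, kept from the question's code
    return driver.page_source
```

With a real session you would call it as doc1 = html.fromstring(open_in_new_tab(driver, 'first URL')), and the same way for the second URL, without tracking indexes yourself.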

To learn more about working with JavaScriptExecutor, refer to this link.

I hope it helps...

Upvotes: 1

Pritam Maske

Reputation: 2760

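Use driver.get(url) in place of driver.execute_script('''window.open("'''+str(url)+'''","_blank");'''). driver.get() loads each URL in the current window, so there is no new window to switch to and the driver never has to quit.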
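
A minimal sketch of that idea, assuming you only need the page source of each URL and are happy to reuse one window. fetch_page_sources and wait_seconds are made-up names for illustration, not Selenium API.

```python
import time


def fetch_page_sources(driver, urls, wait_seconds=3.0):
    """Load each URL in the driver's current window and collect the HTML.

    Hypothetical helper for illustration; not part of Selenium.
    """
    sources = []
    for url in urls:
        driver.get(url)           # navigates the existing window, no new tab
        time.sleep(wait_seconds)  # crude fixed wait, as in the question
        sources.append(driver.page_source)
    return sources
```

With driver = webdriver.Chrome(), calling fetch_page_sources(driver, ['first URL', 'second URL']) should return one HTML string per URL, which you can then pass to html.fromstring() as in the question.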

Upvotes: 2
