Reputation: 129
I'm trying to generate a list of URLs with Selenium. I would like the user to navigate freely in the instrumented browser, and at the end I want a list of the URLs they visited.
I found that the "current_url" property could help with that, but I haven't found a way to detect that the user clicked on a link.
In [117]: from selenium import webdriver
In [118]: browser = webdriver.Chrome()
In [119]: browser.get("http://stackoverflow.com")
--> here, I click on the "Questions" link.
In [120]: browser.current_url
Out[120]: 'http://stackoverflow.com/questions'
--> here, I click on the "Jobs" link.
In [121]: browser.current_url
Out[121]: 'http://stackoverflow.com/jobs?med=site-ui&ref=jobs-tab'
Any hints appreciated!
Thank you,
Upvotes: 2
Views: 947
Reputation: 2198
There isn't really an official way to monitor what a user is doing in Selenium. The only thing you can really do is start the driver, then run a loop that constantly checks driver.current_url. However, I don't know the best way to exit this loop, since I don't know your use case. Maybe try something like:
import time
from selenium import webdriver

urls = []
driver = webdriver.Firefox()
current = 'http://www.google.com'
driver.get(current)
while True:
    if driver.current_url != current:
        current = driver.current_url
        # if you want to capture every URL, including duplicates:
        urls.append(current)
        # or, instead of the line above, if you only want unique URLs:
        # if current not in urls:
        #     urls.append(current)
    time.sleep(0.5)  # short pause between checks so the loop doesn't spin
If you don't have any idea how to end this loop, I'd suggest having the user navigate to a URL that breaks the loop, such as http://www.endseleniumcheck.com, and adding it to the code like this:
import time
from selenium import webdriver

END_URL = 'http://www.endseleniumcheck.com'
urls = []
driver = webdriver.Firefox()
current = 'http://www.google.com'
driver.get(current)
while True:
    # startswith() because the browser may add a trailing slash to the URL
    if driver.current_url.startswith(END_URL):
        break
    if driver.current_url != current:
        current = driver.current_url
        # if you want to capture every URL, including duplicates:
        urls.append(current)
        # or, instead of the line above, if you only want unique URLs:
        # if current not in urls:
        #     urls.append(current)
    time.sleep(0.5)  # short pause between checks so the loop doesn't spin
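If it helps, the polling logic can be pulled out of the script so it can be tried without a browser. This is only a sketch: record_urls, get_url and should_stop are names I made up for illustration, not part of Selenium. get_url is any callable that returns the current URL, and should_stop is any callable that says when to quit.

```python
import time
from typing import Callable, List, Optional


def record_urls(get_url: Callable[[], str],
                should_stop: Callable[[], bool],
                unique: bool = False,
                poll_interval: float = 0.5) -> List[str]:
    """Poll get_url() until should_stop() returns True.

    Returns every URL change in order, or only first-time URLs
    when unique=True. (Hypothetical helper, not a Selenium API.)
    """
    urls: List[str] = []
    last: Optional[str] = None
    while not should_stop():
        current = get_url()
        if current != last:
            last = current
            # With unique=True, skip URLs that were already recorded.
            if not unique or current not in urls:
                urls.append(current)
        time.sleep(poll_interval)
    return urls


# With a real driver this would be called roughly like:
#   record_urls(lambda: driver.current_url,
#               lambda: driver.current_url.startswith(END_URL))
```

Unlike the scripts above, this version also records the starting page, because it compares against None on the first pass.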
Or, if you want to get crafty, you can terminate the loop when the user exits the browser. You can do this by monitoring the process ID with the psutil library (pip install psutil):
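import time
from selenium import webdriver
import psutil

urls = []
driver = webdriver.Firefox()
# driver.binary.process depends on the Selenium version; it may not be
# available in newer releases.
pid = driver.binary.process.pid
current = 'http://www.google.com'
driver.get(current)
while True:
    # stop as soon as the browser process is gone
    if not psutil.pid_exists(pid):
        break
    if driver.current_url != current:
        current = driver.current_url
        # if you want to capture every URL, including duplicates:
        urls.append(current)
        # or, instead of the line above, if you only want unique URLs:
        # if current not in urls:
        #     urls.append(current)
    time.sleep(0.5)  # short pause between checks so the loop doesn't spin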
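Another way to stop when the browser goes away, without psutil, is to treat an error from the URL lookup as the exit signal. A rough sketch, assuming that reading driver.current_url raises once the window is closed (usually a WebDriverException subclass such as NoSuchWindowException, though the exact exception can vary by driver and version). record_until_error and closed_errors are names I invented for this example.

```python
import time
from typing import Callable, List, Optional, Tuple, Type


def record_until_error(get_url: Callable[[], str],
                       closed_errors: Tuple[Type[BaseException], ...] = (Exception,),
                       poll_interval: float = 0.5) -> List[str]:
    """Record URL changes until get_url() raises one of closed_errors.

    (Hypothetical helper, not a Selenium API.)
    """
    urls: List[str] = []
    last: Optional[str] = None
    while True:
        try:
            current = get_url()
        except closed_errors:
            # Assumed to mean the browser or window is gone.
            break
        if current != last:
            last = current
            urls.append(current)
        time.sleep(poll_interval)
    return urls


# With a real driver this would be called roughly like:
#   from selenium.common.exceptions import WebDriverException
#   record_until_error(lambda: driver.current_url,
#                      closed_errors=(WebDriverException,))
```

Passing a narrow closed_errors tuple is safer than the default, which catches any Exception and could hide unrelated bugs.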
Upvotes: 2