H.B.

Reputation: 63

All Cookies from Firefox over Selenium

For our privacy policy we would like to write a crawler that automatically lists all 3rd party connections as well as all cookies. The crawler should run daily, and its results should be compared with those of the previous run.
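The daily comparison could look roughly like the sketch below. It assumes each run yields a collection of (host, name) pairs; diff_cookies is a hypothetical helper name and not part of Selenium.

```python
def diff_cookies(previous, current):
    """Compare the (host, name) pairs of two consecutive crawls."""
    previous, current = set(previous), set(current)
    return {
        # Cookies seen today that were absent in the previous run
        "added": sorted(current - previous),
        # Cookies from the previous run that no longer appear
        "removed": sorted(previous - current),
    }
```

An empty "added" list would mean no new cookies have appeared since the last run.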

For the implementation I used python3 and Selenium with Firefox.

  1. The function get_cookies() returns only the cookies that belong to the current domain, so third-party cookies are missing. -> Does not work correctly.
  2. Crawling the whole website and then visiting every contacted domain again still does not return all cookies. -> Therefore does not work correctly either.
  3. The Firefox profile folder contains an SQLite database (cookies.sqlite) that holds all cookies. However, Firefox does not modify this file when it is driven by Selenium, even though I have already tried several Firefox settings. When I start Firefox without Selenium, the file is modified; with Selenium, it is not. My code looks like this:
import time
from selenium import webdriver

firefox_profile = webdriver.FirefoxProfile("/home/user/.mozilla/firefox/rkggssrl.SeleniumTest")
browser = webdriver.Firefox(firefox_profile, executable_path=r'./geckodriver')
browser.get('https://www.lenovo.com/')  # or any other site
time.sleep(20)

I've already tried the suggestions from the question "Selenium test runs won't save cookies?".

Questions:

  1. How do I get all cookies from Firefox over Selenium?
  2. How does Firefox modify the cookies.sqlite over Selenium?
  3. EDIT: I found out that Selenium creates a temporary folder under /tmp on every startup. This is also ok. But how do I get the correct folder programmatically from Selenium?

Upvotes: 1

Views: 1653

Answers (1)

H.B.

Reputation: 63

The solution to question 3 is relatively simple, and it answers the other questions as well, so I'll leave the question as it is. Selenium creates a new temporary profile at every start. Once I know where that profile is, I can read its cookies.sqlite file and get all cookies.

How I figured it out:

  1. Calling the dir() function (https://docs.python.org/3/library/functions.html#dir) on the browser object gave me the following:
print(dir(browser))
['CONTEXT_CHROME', 'CONTEXT_CONTENT', 'NATIVE_EVENTS_ALLOWED', '__class__', '__delattr__', 
'__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', 
'__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', 
'__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', 
'__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 
'_file_detector', '_is_remote', '_mobile', '_switch_to', '_unwrap_value', 
'_web_element_cls', '_wrap_value', 'add_cookie', 'application_cache', 'back', 'binary', 
'capabilities', 'close', 'command_executor', 'context', 'create_web_element', 
'current_url', 'current_window_handle', 'delete_all_cookies', 'delete_cookie', 
'desired_capabilities', 'error_handler', 'execute', 'execute_async_script', 
'execute_script', 'file_detector', 'file_detector_context', 'find_element', 
'find_element_by_class_name', 'find_element_by_css_selector', 'find_element_by_id', 
'find_element_by_link_text', 'find_element_by_name', 'find_element_by_partial_link_text', 
'find_element_by_tag_name', 'find_element_by_xpath', 'find_elements', 
'find_elements_by_class_name', 'find_elements_by_css_selector', 'find_elements_by_id', 
'find_elements_by_link_text', 'find_elements_by_name', 
'find_elements_by_partial_link_text', 'find_elements_by_tag_name', 
'find_elements_by_xpath', 'firefox_profile', 'forward', 'fullscreen_window', 'get', 
'get_cookie', 'get_cookies', 'get_log', 'get_screenshot_as_base64', 
'get_screenshot_as_file', 'get_screenshot_as_png', 'get_window_position', 
'get_window_rect', 'get_window_size', 'implicitly_wait', 'install_addon', 'log_types', 
'maximize_window', 'minimize_window', 'mobile', 'name', 'orientation', 'page_source', 
'profile', 'quit', 'refresh', 'save_screenshot', 'service', 'session_id', 'set_context', 
'set_page_load_timeout', 'set_script_timeout', 'set_window_position', 'set_window_rect', 
'set_window_size', 'start_client', 'start_session', 'stop_client', 'switch_to', 
'switch_to_active_element', 'switch_to_alert', 'switch_to_default_content', 
'switch_to_frame', 'switch_to_window', 'title', 'uninstall_addon', 'w3c', 
'window_handles']
  2. The capabilities attribute stood out in that list, so I printed it:
print(browser.capabilities)
{'acceptInsecureCerts': True, 'browserName': 'firefox', 'browserVersion': '76.0.1', 
'moz:accessibilityChecks': False, 'moz:buildID': '20200507114007', 
'moz:geckodriverVersion': '0.26.0', 'moz:headless': False, 'moz:processID': 110409, 
'moz:profile': '/tmp/rust_mozprofilel3IJsK', 'moz:shutdownTimeout': 60000, 
'moz:useNonSpecCompliantPointerOrigin': False, 'moz:webdriverClick': True, 
'pageLoadStrategy': 'normal', 'platformName': 'linux',
'platformVersion': '5.3.0-1020-azure', 'rotatable': False, 'setWindowRect': True, 
'strictFileInteractability': False, 'timeouts': {'implicit': 0, 'pageLoad': 300000, 
'script': 30000}, 'unhandledPromptBehavior': 'dismiss and notify'}

The temporary profile directory is available under the 'moz:profile' key. Before closing Selenium, I now run the following:

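import os
import shutil

# Save cookie file
outputDest = "yourOutputDestination"
capability = browser.capabilities
browserDir = capability['moz:profile']  # temporary profile directory
cookieFile = os.path.join(browserDir, 'cookies.sqlite')
# shutil.move also works if /tmp and outputDest are on different filesystems,
# where os.rename would raise an OSError
shutil.move(cookieFile, outputDest)
# Close selenium (quit() closes all windows and ends the session)
if browser is not None:
    browser.quit()
    browser = None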
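
Once the file has been saved, the cookies can be read with Python's built-in sqlite3 module. The following is only a minimal sketch. It assumes the cookies are stored in a table named moz_cookies with host, name and value columns, which matches the Firefox profiles I know of but is not a guaranteed, stable schema. read_cookies is a hypothetical helper name, not part of Selenium.

```python
import sqlite3


def read_cookies(db_path):
    """Return (host, name, value) tuples for every cookie in a cookies.sqlite file."""
    # Assumption: Firefox keeps its cookies in the moz_cookies table.
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT host, name, value FROM moz_cookies ORDER BY host, name"
        ).fetchall()
    finally:
        connection.close()
    return rows
```

Calling read_cookies(outputDest) on the saved file should then list the cookies of all domains, not only those of the current one.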

Upvotes: 3
