user3102085
user3102085

Reputation: 499

How to get request headers in Selenium

https://www.sahibinden.com/en

If you open it incognito window and check headers in Fiddler then these are the two main headers you get: enter image description here

When I click the last one and check request headers this is what I get enter image description here

I want to get these headers in Python. Is there any way that I can get these using Selenium? Im a bit clueless here.

Upvotes: 15

Views: 77919

Answers (7)

MortenB
MortenB

Reputation: 3567

You can use https://pypi.org/project/selenium-wire/ a plug-in replacement for webdriver adding request/response manipulation even for https by using its own local ssl certificate.

from seleniumwire import webdriver
d = webdriver.Chrome() # make sure chrome/chromedriver is in path
d.get('https://en.wikipedia.org')
vars(d.requests[-1].headers)

will list the headers in the last requests object list:

{'policy': Compat32(), '_headers': [('content-length', '1361'), 
('content-type', 'application/json'), ('sec-fetch-site', 'none'), 
('sec-fetch-mode', 'no-cors'), ('sec-fetch-dest', 'empty'), 
('user-agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.5112.102 Safari/537.36'), 
('accept-encoding', 'gzip, deflate, br')],
'_unixfrom': None, '_payload': None, '_charset': None,
'preamble': None, 'epilogue': None, 'defects': [], '_default_type': 'text/plain'}

Upvotes: 0

keyiflerolsun
keyiflerolsun

Reputation: 409

js_headers = '''
    const _xhr = new XMLHttpRequest();
    _xhr.open("HEAD", document.location, false);
    _xhr.send(null);

    const _headers = {};

    _xhr.getAllResponseHeaders().trim().split(/[\\r\\n]+/).map((value) => value.split(/: /)).forEach((keyValue) => {
        _headers[keyValue[0].trim()] = keyValue[1].trim();
    });

    return _headers;
'''

page_headers = driver.execute_script(js_headers)

type(page_headers) # -> dict

Upvotes: 0

Celso Jr
Celso Jr

Reputation: 295

Maybe you can use BrowserMob Proxy for this. Here is a example:

import settings

from browsermobproxy import Server
from selenium.webdriver import DesiredCapabilities

config = settings.Config

server = Server(config.BROWSERMOB_PATH)
server.start()
proxy = server.create_proxy()

from selenium import webdriver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=%s' % proxy.proxy)
chrome_options.add_argument('--headless')

capabilities = DesiredCapabilities.CHROME.copy()
capabilities['acceptSslCerts'] = True
capabilities['acceptInsecureCerts'] = True

driver = webdriver.Chrome(options=chrome_options,
    desired_capabilities=capabilities,
   executable_path=config.CHROME_PATH)

proxy.new_har("sahibinden", options={'captureHeaders': True})
driver.get("https://www.sahibinden.com/en")

entries = proxy.har['log']["entries"]
for entry in entries:
    if 'request' in entry.keys():
        print(entry['request']['url'])
        print(entry['request']['headers'])
        print('\n')

proxy.close()
driver.quit()

Upvotes: 2

raifpy
raifpy

Reputation: 175

You can run JS command like this;

var req = new XMLHttpRequest()
req.open('GET', document.location, false)
req.send(null)
return req.getAllResponseHeaders()

On Python;

driver.get("https://t.me/codeksiyon")
headers = driver.execute_script("var req = new XMLHttpRequest();req.open('GET', document.location, false);req.send(null);return req.getAllResponseHeaders()")

# type(headers) == str

headers = headers.splitlines()

Upvotes: 13

Konemiees
Konemiees

Reputation: 319

You can use Selenium Wire. It is a Selenium extension which has been developed for this exact purpose.

https://pypi.org/project/selenium-wire/

An example after pip install:

##  Import webdriver from Selenium Wire instead of Selenium
from seleniumwire import webdriver

##  Get the URL
driver = webdriver.Chrome("my/path/to/driver", options=options)
driver.get("https://my.test.url.com")

##  Print request headers
for request in driver.requests:
  print(request.url) # <--------------- Request url
  print(request.headers) # <----------- Request headers
  print(request.response.headers) # <-- Response headers

Upvotes: 31

undetected Selenium
undetected Selenium

Reputation: 193308

The bottom line is, No, you can't retrieve the request headers using Selenium.


Details

It had been a long time demand from the Selenium users to add the WebDriver methods to read the HTTP status code and headers from a HTTP response. We have discussed about implementing this feature through Selenium at length within the discussion WebDriver lacks HTTP response header and status code methods.

However, Jason Leyba (Selenium contributor) in his comment straightly mentioned:

We will not be adding this feature to the WebDriver API as it falls outside of our current scope (emulating user actions).

Ashley Leyba further added, attempting to make WebDriver the ideal web testing tool will suffer in overall quality as driver.get(url) blocks until the browser has loaded the page and return the response for the final loaded page. So in case of a login redirects, status codes and headers will always end up with a 200 instead of the 302 you're looking for.

Finally, Simon M Stewart (WebDriver creator) in his comment concluded that:

This feature isn't going to happen. The recommended approach is to either extend the HtmlUnitDriver to access the information you require or to make use of an external proxy that exposes this information such as the BrowserMob Proxy

Upvotes: 9

Baris
Baris

Reputation: 415

It's not possible to get headers using Selenium. Further information

However, you might use other libraries such as requests, BeautifulSoup to get headers.

Upvotes: -1

Related Questions