no nein
no nein

Reputation: 711

Error when loading cookies into a Python request session

I am trying to load cookies into my request session in Python from selenium exported cookies, however when I do it returns the following error: "'list' object has no attribute 'extract_cookies'"

def load_cookies(filename):
    with open(filename, 'rb') as f:
        return pickle.load(f)

initial_state= requests.Session()
initial_state.cookies=load_cookies(time_cookie_file)
search_requests = initial_state.get(search_url)

Everywhere I see this should work, however my cookies are a list of dictionaries, which is what I understand all cookies are, and why I assume this works with Selenium. However for some reason it does not work with requests, any and all help in this regard would be really great, it feels like I am missing something obvious!

Cookies have been dumped from Selenium using:

with open("Filepath.pkl", 'wb') as f:
    pickle.dump(driver.get_cookies(), f)

An example of the cookies would be (slightly obfuscated):

[{'domain': '.website.com',
  'expiry': 1640787949,
  'httpOnly': False,
  'name': '_ga',
  'path': '/',
  'secure': False,
  'value': 'GA1.2.1111111111.1111111111'},
 {'domain': 'website.com',
  'expiry': 1585488346,
  'httpOnly': False,
  'name': '__pnahc',
  'path': '/',
  'secure': False,
  'value': '0'}]

I have now managed to load in the cookies as per the answer below, however it does not seem like the cookies are loaded in properly as they do not remember anything, however if I load the cookies in when browsing through Selenium they work fine.

Upvotes: 5

Views: 13270

Answers (6)

undetected Selenium
undetected Selenium

Reputation: 193058

Cookie

The Cookie HTTP request header contains stored HTTP cookie previously sent by the server with the Set-Cookie header. A HTTP cookie is a small piece of data that a server sends to the user's web browser. The browser may store the cookies and send it back with the next request to the same server. Typically, to tell if two requests came from the same browser, keeping the user logged in.


Demonstration using Selenium

To demonstrate the usage of cookies using Selenium we have stored the cookies using once the user had logged into the website http://demo.guru99.com/test/cookie/selenium_aut.php. In the next step, we opened the same website, adding the cookies and was able to land as a logged in user.

  • Code Block to store the cookies:

    from selenium import webdriver
    import pickle
    
    options = webdriver.ChromeOptions() 
    options.add_argument("start-maximized")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
    driver.get('http://demo.guru99.com/test/cookie/selenium_aut.php')
    driver.find_element_by_name("username").send_keys("abc123")
    driver.find_element_by_name("password").send_keys("123xyz")
    driver.find_element_by_name("submit").click()
    pickle.dump( driver.get_cookies() , open("cookies.pkl","wb"))
    
  • Code Block to use the stored cookies for automatic authentication:

    from selenium import webdriver
    import pickle
    
    options = webdriver.ChromeOptions() 
    options.add_argument("start-maximized")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
    driver.get('http://demo.guru99.com/test/cookie/selenium_aut.php')
    cookies = pickle.load(open("cookies.pkl", "rb"))
    for cookie in cookies:
        driver.add_cookie(cookie)
    driver.get('http://demo.guru99.com/test/cookie/selenium_cookie.php')
    

Demonstration using Requests

To demonstrate usage of cookies using and we have accessed the site https://www.google.com, added a new dictionary of cookies:

{'name':'my_own_cookie','value': 'debanjan' ,'domain':'.stackoverflow.com'}

Next, we have used the same requests session to send another request which was successful as follows:

  • Code Block:

    import requests
    
    s1 = requests.session()
    s1.get('https://www.google.com')
    print("Original Cookies")
    print(s1.cookies)
    print("==========")
    cookie = {'name':'my_own_cookie','value': 'debanjan' ,'domain':'.stackoverflow.com'}
    s1.cookies.update(cookie)
    print("After new Cookie added")
    print(s1.cookies)
    
  • Console Output:

    Original Cookies
    <RequestsCookieJar[<Cookie 1P_JAR=2020-01-21-14 for .google.com/>, <Cookie NID=196=NvZMMRzKeV6VI1xEqjgbzJ4r_3WCeWWjitKhllxwXUwQcXZHIMRNz_BPo6ujQduYCJMOJgChTQmXSs6yKX7lxcfusbrBMVBN_qLxLIEah5iSBlkdBxotbwfaFHMd-z5E540x02-YZtCm-rAIx-MRCJeFGK2E_EKdZaxTw-StRYg for .google.com/>]>
    ==========
    After new Cookie added
    <RequestsCookieJar[<Cookie domain=.stackoverflow.com for />, <Cookie name=my_own_cookie for />, <Cookie value=debanjan for />, <Cookie 1P_JAR=2020-01-21-14 for .google.com/>, <Cookie NID=196=NvZMMRzKeV6VI1xEqjgbzJ4r_3WCeWWjitKhllxwXUwQcXZHIMRNz_BPo6ujQduYCJMOJgChTQmXSs6yKX7lxcfusbrBMVBN_qLxLIEah5iSBlkdBxotbwfaFHMd-z5E540x02-YZtCm-rAIx-MRCJeFGK2E_EKdZaxTw-StRYg for .google.com/>]>
    

Conclusion

Clearly, the newly added dictionary of cookies {'name':'my_own_cookie','value': 'debanjan' ,'domain':'.stackoverflow.com'} is pretty much in use within the second request.


Passing Selenium Cookies to Python Requests

Now, if your usecase is to passing Selenium Cookies to Python Requests, you can use the following solution:

from selenium import webdriver
import pickle
import requests

options = webdriver.ChromeOptions() 
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get('http://demo.guru99.com/test/cookie/selenium_aut.php')
driver.find_element_by_name("username").send_keys("abc123")
driver.find_element_by_name("password").send_keys("123xyz")
driver.find_element_by_name("submit").click()

# Storing cookies through Selenium
pickle.dump( driver.get_cookies() , open("cookies.pkl","wb"))
driver.quit()

# Passing cookies to Session
session = requests.session()  # or an existing session
with open('cookies.pkl', 'rb') as f:
    session.cookies.update(pickle.load(f))
search_requests = session.get('https://www.google.com/')
print(session.cookies)

Upvotes: 6

pbacterio
pbacterio

Reputation: 1152

Requests also accept http.cookiejar.CookieJar objects:

https://docs.python.org/3.8/library/http.cookiejar.html#cookiejar-and-filecookiejar-objects

Upvotes: 0

Jortega
Jortega

Reputation: 3790

So requests wants all "values" in your cookie to be a string. Possibly the same with the "key". Cookies also does not want a list as your function load_cookies returns. Cookies can be created for the request.utils with cookies = requests.utils.cookiejar_from_dict(....

Lets say I go to "https://stackoverflow.com/" with selenium and save the cookies as you have done.

from selenium import webdriver
import pickle
import requests

#Go to the website
driver = webdriver.Chrome(executable_path=r'C:\Path\\To\\Your\\chromedriver.exe')
driver.get('https://stackoverflow.com/')

#Save the cookies in a file
with open("C:\Path\To\Your\Filepath.pkl", 'wb') as f:
    pickle.dump(driver.get_cookies(), f)

driver.quit()
#you function to get the cookies from the file.
def load_cookies(filename):
  with open(filename, 'rb') as f:
    return pickle.load(f)

saved_cookies_list = load_cookies("C:\Path\To\Your\Filepath.pkl")

#Set request session
initial_state = requests.Session()
#Function to fix cookie values and add cookies to request_session
def fix_cookies_and_load_to_requests(cookie_list, request_session):
    for index in range(len(cookie_list)):
        for item in cookie_list[index]:
            if type(cookie_list[index][item]) != str:
                print("Fix cookie value: ", cookie_list[index][item])
                cookie_list[index][item] = str(cookie_list[index][item])
        cookies = requests.utils.cookiejar_from_dict(cookie_list[index])
        request_session.cookies.update(cookies)
    return request_session

initial_state_with_cookies = fix_cookies_and_load_to_requests(cookie_list=saved_cookies_list, request_session=initial_state)

search_requests = initial_state_with_cookies.get("https://stackoverflow.com/")
print("search_requests:", search_requests)

Upvotes: 0

Seema Nair
Seema Nair

Reputation: 165

You can create a session. The session class handles cookies between requests.

s = requests.Session()


login_resp = s.post('https://example.com/login', login_data)
self.cookies = self.login_resp.cookies



cookiedictreceived = {}
cookiedictreceived=requests.utils.dict_from_cookiejar(self.login_resp.cookies)

Upvotes: 0

Sers
Sers

Reputation: 12255

You can get cookies and use only name/value. You'll need headers also. You can get them from dev tools or by using proxy.

Basic example:

driver.get('https://website.com/')

# ... login or do anything

cookies = {}
for cookie in driver.get_cookies():
    cookies[cookie['name']] = cookie['value']
# Write to a file if need or do something
# import json
# with open("cookies.txt", 'w') as f:
#    f.write(json.dumps(cookies))

And usage:

# Read cookies from file as Dict
# with open('cookies.txt') as reader:
#     cookies = json.loads(reader.read())

# use cookies
response = requests.get('https://website.com/', headers=headers, cookies=cookies)

Stackoverflow headers example, some headers can be required some not. You can find information here and here. You can get request headers using dev tools Network tab:

headers = {
    'authority': 'stackoverflow.com',
    'pragma': 'no-cache',
    'cache-control': 'no-cache',
    'dnt': '1',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36',
    'sec-fetch-user': '?1',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'sec-fetch-site': 'same-origin',
    'sec-fetch-mode': 'navigate',
    'referer': 'https://stackoverflow.com/questions/tagged?sort=Newest&tagMode=Watched&uqlId=8338',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'ru,en-US;q=0.9,en;q=0.8,tr;q=0.7',
}

Upvotes: 0

zamir
zamir

Reputation: 2522

Since you are replacing session.cookies (RequestsCookieJar) with a list which don't have those attributes, it won't work.

You can import those cookies one by one by using:

for c in your_cookies_list:
   initial_state.cookies.set(name=c['name'], value=c['value'])

I've tried loading the whole cookie but it seems like requests doesn't recognize those ones and returns:

TypeError: create_cookie() got unexpected keyword arguments: ['expiry', 'httpOnly']

requests accepts expires instead and HttpOnly comes nested within rest

Update:

We can also change the dict keys for expiry and httpOnly so that requests correctly load them instead of throwing an exception, by using dict.pop() which deletes an item from dict by the key and returns the value of deleted key so after we add a new key with deleted item value then unpack & pass them as kwargs:

for c in your_cookies_list:
    c['expires'] = c.pop('expiry')
    c['rest'] = {'HttpOnly': c.pop('httpOnly')}
    initial_state.cookies.set(**c)

Upvotes: 3

Related Questions