some

Reputation: 363

reading cookies file created by curl

I have the following cookie saved by curl (in test.txt, tab-separated, this editor doesn't preserve tabs):

# Netscape HTTP Cookie File
# http://curlm.haxx.se/rfc/cookie_spec.html
# This file was generated by libcurl! Edit at your own risk.

#HttpOnly_my-example.com    FALSE   /   FALSE   0   _rails-root_session test

I'm trying to read it with the following code:

import sys

if sys.version_info < (3,):
    from cookielib import Cookie, MozillaCookieJar
else:
    from http.cookiejar import Cookie, MozillaCookieJar

def load_cookies_from_mozilla(filename):
    ns_cookiejar = MozillaCookieJar()
    ns_cookiejar.load(filename, ignore_discard=True)
    return ns_cookiejar

cookies = load_cookies_from_mozilla("test.txt")
print (len(cookies))

It outputs 0 (the cookie is not read). If I manually change the cookie line to the following (removing the #HttpOnly_ prefix and changing the expiration time from 0 to the empty string; again, tab-separated):

my-example.com  FALSE   /   FALSE       _rails-root_session test

then it outputs 1 (successfully read the cookie).

What needs to be changed in my Python code to read the original cookie line? Preferably, I would also like to be able to save it in the same format (with the HttpOnly flag, and with 0 instead of the empty string for a never-expiring cookie).

Thanks.

Upvotes: 8

Views: 6701

Answers (2)

Mitch

Reputation: 1554

This appears to be an open bug: https://bugs.python.org/issue2190.

This bug report contains a link to a workaround: https://gerrit.googlesource.com/git-repo/+/master/subcmds/sync.py#995

In that linked code, the developer creates a temporary cookies file, removes the "#HttpOnly_" prefixes, and then creates a cookiejar with that temporary file.

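import tempfile

try:
  import cookielib  # Python 2
except ImportError:
  import http.cookiejar as cookielib  # Python 3

cookiefile = "test.txt"  # path to the curl-generated cookie file

# Open in text mode so that str lines can also be written on Python 3.
tmpcookiefile = tempfile.NamedTemporaryFile(mode="w+")
tmpcookiefile.write("# HTTP Cookie File\n")
try:
  with open(cookiefile) as f:
    for line in f:
      # Strip curl's HttpOnly marker so the parser no longer treats the line as a comment.
      if line.startswith("#HttpOnly_"):
        line = line[len("#HttpOnly_"):]
      tmpcookiefile.write(line)
  tmpcookiefile.flush()
  cookiejar = cookielib.MozillaCookieJar(tmpcookiefile.name)
  try:
    cookiejar.load()
  except cookielib.LoadError:
    # Fall back to an empty in-memory jar if the file cannot be parsed.
    cookiejar = cookielib.CookieJar()
finally:
  tmpcookiefile.close()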
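
For Python 3, the same idea can be sketched as a small helper. This is only a sketch under assumptions: the name load_curl_cookies is made up for illustration, and passing ignore_expires=True is an addition that is not in the linked code. It is there because the cookie in the question has an expiry of 0, which the loader would otherwise skip as already expired.

```python
import os
import tempfile
from http.cookiejar import MozillaCookieJar

HTTPONLY_PREFIX = "#HttpOnly_"  # marker curl prepends to HttpOnly cookie lines


def load_curl_cookies(path):
    """Hypothetical helper: load a curl cookie file, keeping HttpOnly cookies."""
    with open(path) as src:
        # Drop the prefix so these lines are parsed as cookies, not comments.
        lines = [
            line[len(HTTPONLY_PREFIX):] if line.startswith(HTTPONLY_PREFIX) else line
            for line in src
        ]

    # MozillaCookieJar loads from a file name, so write the cleaned copy to disk.
    fd, tmp_path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.writelines(lines)
        jar = MozillaCookieJar()
        # ignore_expires=True keeps cookies whose expiry field is 0.
        jar.load(tmp_path, ignore_discard=True, ignore_expires=True)
    finally:
        os.remove(tmp_path)
    return jar
```

Calling load_curl_cookies("test.txt") on the file from the question should give a jar with one cookie. Saving with save(filename, ignore_discard=True, ignore_expires=True) keeps the 0 expiry, but the "#HttpOnly_" prefix is gone after this step and would have to be re-added separately.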

Upvotes: 5

Fisher

Reputation: 21

I tested your code and got it working with two changes. First, remove the '#' at the start of the cookie line in the file; I think it makes the parser treat everything after it as a comment. Second, the 0 in the cookie line is the expiry time, and 0 is treated as already expired. You could change the 0 to an empty string or to a later time, but I suggest passing the argument ignore_expires=True instead. The official documentation describes the arguments as:

ignore_discard: save even cookies set to be discarded.

ignore_expires: save even cookies that have expired. The file is overwritten if it already exists.

The resulting code is:

import sys
if sys.version_info < (3,):
    from cookielib import Cookie, MozillaCookieJar
else:
    from http.cookiejar import Cookie, MozillaCookieJar

def load_cookies_from_mozilla(filename):
    ns_cookiejar = MozillaCookieJar()
    ns_cookiejar.load(filename, ignore_discard=True, ignore_expires=True)
    return ns_cookiejar

cookies = load_cookies_from_mozilla("test.txt")
print (len(cookies))

For more detail, see the related question "Using cookies.txt file with Python Requests".
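
A quick way to check the effect of ignore_expires in isolation is to load the same cookie line (with the "#HttpOnly_" prefix already removed) twice, once without and once with the flag. The helper name count_cookies and the inline file contents below are just for illustration:

```python
import os
import tempfile
from http.cookiejar import MozillaCookieJar

# The cookie from the question, with the "#HttpOnly_" prefix already removed.
COOKIE_FILE = (
    "# Netscape HTTP Cookie File\n"
    "my-example.com\tFALSE\t/\tFALSE\t0\t_rails-root_session\ttest\n"
)


def count_cookies(text, **load_kwargs):
    """Hypothetical helper: write text to a temp file and count loaded cookies."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
        jar = MozillaCookieJar()
        jar.load(path, **load_kwargs)
        return len(jar)
    finally:
        os.remove(path)


# Expiry 0 is in the past, so the cookie is dropped unless ignore_expires is set.
default_count = count_cookies(COOKIE_FILE, ignore_discard=True)
lenient_count = count_cookies(COOKIE_FILE, ignore_discard=True, ignore_expires=True)
print(default_count, lenient_count)  # prints: 0 1
```

So ignore_discard=True alone is not enough for a line whose expiry field is 0; it only helps when the expiry field is empty.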

Upvotes: 2
