morpheous

Reputation: 16996

Python form POST using urllib2 (also question on saving/using cookies)

I am trying to write a function to post form data and save returned cookie info in a file so that the next time the page is visited, the cookie information is sent to the server (i.e. normal browser behavior).

I wrote this relatively easily in C++ using libcurl, but have spent almost an entire day trying to write it in Python using urllib2, still without success.

This is what I have so far:

import logging
import sys  # needed for the sys.exit() call in login()
import urllib, urllib2

# the path and filename to save your cookies in
COOKIEFILE = 'cookies.lwp'

cj = None
ClientCookie = None
cookielib = None


logger = logging.getLogger(__name__)

# Let's see if cookielib is available
try:
    import cookielib
except ImportError:
    logger.debug('importing cookielib failed. Trying ClientCookie')
    try:
        import ClientCookie
    except ImportError:
        logger.debug('ClientCookie isn\'t available either')
        urlopen = urllib2.urlopen
        Request = urllib2.Request
    else:
        logger.debug('imported ClientCookie successfully')
        urlopen = ClientCookie.urlopen
        Request = ClientCookie.Request
        cj = ClientCookie.LWPCookieJar()

else:
    logger.debug('Successfully imported cookielib')
    urlopen = urllib2.urlopen
    Request = urllib2.Request

    # This is a subclass of FileCookieJar
    # that has useful load and save methods
    cj = cookielib.LWPCookieJar()


login_params = {'name': 'anon', 'password': 'pass' }

def login(theurl, login_params):
  init_cookies();

  data = urllib.urlencode(login_params)
  txheaders =  {'User-agent' : 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}

  try:
    # create a request object
    req = Request(theurl, data, txheaders)

    # and open it to return a handle on the url
    handle = urlopen(req)

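  except IOError, e:
    # use the module-level 'logger'; no object named 'log' is defined
    logger.debug('Failed to open "%s".' % theurl)
    if hasattr(e, 'code'):
      logger.debug('Failed with error code - %s.' % e.code)
    elif hasattr(e, 'reason'):
      # e.reason is not always a string, so format it rather than concatenate
      logger.debug("The error object has the following 'reason' attribute: %s" % e.reason)
      sys.exit()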

  else:

    if cj is None:
      logger.debug('We don\'t have a cookie library available - sorry.')
    else:
      print 'These are the cookies we have received so far :'
      for index, cookie in enumerate(cj):
        print index, '  :  ', cookie

      # save the cookies again  
      cj.save(COOKIEFILE) 

      #return the data
      return handle.read()



# FIXME: I need to fix this so that it takes into account any cookie data we may have stored
def get_page(*args, **query):
    if len(args) != 1:
        raise ValueError(
            "get_page() takes exactly 1 argument (%d given)" % len(args)
        )
    url = args[0]
    query = urllib.urlencode(list(query.iteritems()))
    if not url.endswith('/') and query:
        url += '/'
    if query:
        url += "?" + query
    resource = urllib.urlopen(url)
    logger.debug('GET url "%s" => "%s", code %d' % (url,
                                                    resource.url,
                                                    resource.code))
    return resource.read() 

When I attempt to log in, I pass the correct username and password, yet the login fails and no cookie data is saved.

My two questions are:

Upvotes: 17

Views: 17012

Answers (3)

cxsun

Reputation: 11

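Pass ignore_discard=True and ignore_expires=True when saving the cookie jar. By default, save() skips session cookies (those marked to be discarded) and expired cookies. In my case this saved the cookies correctly: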

self.cj.save(cookie_file, ignore_discard=True, ignore_expires=True)
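To see why those flags matter, here is a small offline sketch (no network needed). It builds a session-only cookie by hand, saves it with and without ignore_discard, and counts what comes back on reload. The cookie name, value and domain are made up for illustration, and the import fallback is only there so the same snippet runs on Python 3, where cookielib became http.cookiejar.

```python
import os
import tempfile

try:                      # Python 2 name, as used in this thread
    import cookielib
except ImportError:       # Python 3 renamed the module
    import http.cookiejar as cookielib


def make_session_cookie(name, value, domain):
    """Build a cookie with no expiry date, i.e. a session-only cookie."""
    return cookielib.Cookie(
        0, name, value,
        None, False,            # port, port_specified
        domain, True, False,    # domain, domain_specified, domain_initial_dot
        '/', True,              # path, path_specified
        False,                  # secure
        None,                   # expires: None means "session cookie"
        True,                   # discard: drop when the session ends
        None, None, {})         # comment, comment_url, rest


def count_after_round_trip(ignore_discard):
    """Save one session cookie, reload the file, report how many survived."""
    fd, path = tempfile.mkstemp(suffix='.lwp')
    os.close(fd)
    try:
        jar = cookielib.LWPCookieJar()
        # hypothetical cookie, for illustration only
        jar.set_cookie(make_session_cookie('sessionid', 'abc123', 'example.com'))
        jar.save(path, ignore_discard=ignore_discard, ignore_expires=True)

        reloaded = cookielib.LWPCookieJar()
        reloaded.load(path, ignore_discard=ignore_discard, ignore_expires=True)
        return len(reloaded)
    finally:
        os.remove(path)
```

Many login cookies are session cookies, so a plain save() can leave you with an empty file even though the login itself worked. Note that load() takes the same flags and needs them too.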

Upvotes: 1

dirkjot

Reputation: 3736

If you are having a hard time getting your POST requests to work (as I did with a login form), it definitely pays to install the Live HTTP headers extension for Firefox (http://livehttpheaders.mozdev.org/index.html). This small extension can, among other things, show you the exact POST data that is sent when you log in manually.

In my case, I had banged my head against the wall for hours because the site insisted on an extra field with 'action=login' (doh!).
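As a rough sketch of what that fix looks like in code: once you know which hidden fields the form sends, add them to the POST body next to the visible ones. The helper name build_login_body and the field values below are only illustrative; the real field names are whatever your site's form uses. The import fallback just lets the snippet run on Python 3 as well.

```python
try:                      # Python 2 location, as used in this thread
    from urllib import urlencode
except ImportError:       # Python 3 moved it
    from urllib.parse import urlencode


def build_login_body(visible_fields, hidden_fields):
    """Encode visible and hidden form fields into one POST body.

    Both arguments are lists of (name, value) pairs, so the fields are
    encoded in the order given.
    """
    return urlencode(list(visible_fields) + list(hidden_fields))


# Hypothetical values: the credentials from the question plus the extra
# 'action=login' field mentioned above.
body = build_login_body(
    [('name', 'anon'), ('password', 'pass')],
    [('action', 'login')],
)
```

The resulting string is what you would pass as the data argument of urllib2.Request or opener.open.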

Upvotes: 2

Anthony Briggs

Reputation: 3465

There are quite a few problems with the code that you've posted. Typically you'll want to build a custom opener which can handle redirects, https, etc., otherwise you'll run into trouble. As far as the cookies themselves go, you need to call the load and save methods on your cookiejar, and use one of its subclasses, such as MozillaCookieJar or LWPCookieJar.

Here's a class I wrote to log in to Facebook, back when I was playing silly web games. I just modified it to use a file-based cookiejar, rather than an in-memory one.

import cookielib
import os
import urllib
import urllib2

# set these to whatever your fb account is
fb_username = "[email protected]"
fb_password = "secretpassword"

cookie_filename = "facebook.cookies"

class WebGamePlayer(object):

    def __init__(self, login, password):
        """ Start up... """
        self.login = login
        self.password = password

        self.cj = cookielib.MozillaCookieJar(cookie_filename)
        if os.access(cookie_filename, os.F_OK):
            self.cj.load()
        self.opener = urllib2.build_opener(
            urllib2.HTTPRedirectHandler(),
            urllib2.HTTPHandler(debuglevel=0),
            urllib2.HTTPSHandler(debuglevel=0),
            urllib2.HTTPCookieProcessor(self.cj)
        )
        self.opener.addheaders = [
            ('User-agent', ('Mozilla/4.0 (compatible; MSIE 6.0; '
                           'Windows NT 5.2; .NET CLR 1.1.4322)'))
        ]

        # need this twice - once to set cookies, once to log in...
        self.loginToFacebook()
        self.loginToFacebook()

        self.cj.save()

    def loginToFacebook(self):
        """
        Handle login. This should populate our cookie jar.
        """
        login_data = urllib.urlencode({
            'email' : self.login,
            'pass' : self.password,
        })
        response = self.opener.open("https://login.facebook.com/login.php", login_data)
        return ''.join(response.readlines())

test = WebGamePlayer(fb_username, fb_password)

After you've set your username and password, you should see a file, facebook.cookies, with your cookies in it. In practice you'll probably want to modify it to check whether you have an active cookie and use that, then log in again if access is denied.
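A minimal sketch of that "reuse the cookie, log in again only if needed" idea might look like the following. Everything here is an assumption for illustration: the helper names, the name of the session cookie, and the is_denied check all depend on the site you are talking to. The import fallback is only so the snippet also runs on Python 3.

```python
import time

try:                      # Python 2 name, as used above
    import cookielib
except ImportError:       # Python 3 renamed the module
    import http.cookiejar as cookielib


def has_live_cookie(jar, name, now=None):
    """Return True if the jar holds an unexpired cookie called `name`."""
    if now is None:
        now = time.time()
    return any(c.name == name and not c.is_expired(now) for c in jar)


def fetch(open_url, log_in, is_denied, jar, cookie_name, url):
    """Fetch `url`, logging in first or retrying once when necessary.

    open_url(url) returns the page body, log_in() performs the login,
    and is_denied(body) decides whether the site rejected us. All three
    are callables supplied by the caller (hypothetical hooks).
    """
    if not has_live_cookie(jar, cookie_name):
        log_in()
    body = open_url(url)
    if is_denied(body):
        # The cookie looked valid locally but the server disagreed.
        log_in()
        body = open_url(url)
    return body
```

With the class above, open_url could be something like lambda u: player.opener.open(u).read(), and log_in could call player.loginToFacebook() followed by player.cj.save(). How to recognise a denied page is site-specific, so is_denied is left to you.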

Upvotes: 30
