user2729490

Reputation: 31

pycurl, how to send POST data for Multiple Select Form?

I'm trying to do some webscraping, and it involves sending a form with a Multiple Select Box List that looks similar to this:

<select name="multipleSelectForm" multiple="multiple" size="5">
    <option value="value1">value1</option>
    <option value="value2">value2</option>
</select>

Now, I want to send both value1 and value2 using pycurl, for example:

import urllib
import pycurl

c = pycurl.Curl()

data = {'multipleSelectForm':'value1',
        'multipleSelectForm':'value2'}

c.setopt(c.URL, 'http://www.example.com')

c.setopt(c.POST, 1)
post = urllib.urlencode(data)
c.setopt(c.POSTFIELDS, post)

c.perform()

Now, the obvious problem with this is that it sets multipleSelectForm multiple times. I suspect the requested page is looking for a multipleSelectForm array rather than individual variables (this is just a guess, I'm not actually sure), and hence the POST data it receives isn't correct.

I tried using Google Chrome's dev tools to see the traffic of what it's doing and when I looked at the Form Data, it looked like this:

multipleSelectForm:value1
multipleSelectForm:value2

I'm a bit lost on how to approach all of this, so any help would be appreciated.

Upvotes: 3

Views: 4610

Answers (1)

olamork

Reputation: 179

From your code, it looks like the data you're actually sending is just

{ 'multipleSelectForm':'value2' }

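That's because it's a dictionary, which keeps only the last value assigned to a repeated key. If you set the data up as a sequence of tuple pairs instead, urlencode will send both values: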

data = (('multipleSelectForm', 'value1'), ('multipleSelectForm', 'value2'))
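To see the difference without sending any request, here is a minimal sketch that only builds the encoded body. It assumes Python 3, where urlencode lives in urllib.parse (the question's code uses the Python 2 location, urllib.urlencode). The variable names and the doseq variant are illustrative additions, not part of the original code.

```python
from urllib.parse import urlencode

# A dict literal keeps only the last value for a repeated key.
as_dict = {'multipleSelectForm': 'value1',
           'multipleSelectForm': 'value2'}

# A sequence of (name, value) pairs preserves every entry.
as_pairs = (('multipleSelectForm', 'value1'),
            ('multipleSelectForm', 'value2'))

# Alternative: one key mapped to a list, expanded with doseq=True.
as_list = {'multipleSelectForm': ['value1', 'value2']}

dict_body = urlencode(as_dict)
pairs_body = urlencode(as_pairs)
list_body = urlencode(as_list, doseq=True)

print(dict_body)   # multipleSelectForm=value2
print(pairs_body)  # multipleSelectForm=value1&multipleSelectForm=value2
print(list_body)   # multipleSelectForm=value1&multipleSelectForm=value2
```

Either of the last two strings can then be passed to c.setopt(c.POSTFIELDS, ...) exactly as in the question.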

You can test this yourself by setting up a tiny debug HTTP server:

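from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

class hand(BaseHTTPRequestHandler):
    # Override __init__ so the raw request is printed instead of being handled.
    # No response is sent back, so the client may report an empty reply.
    def __init__(self, socket, *args):
        print socket.recv(10000)

# Listen on all interfaces, port 8080.
server = HTTPServer(('', 8080), hand)
server.serve_forever()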

and then hitting it with your script. I used this to confirm that passing the tuple list does what I expect.
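On the question of whether the page expects an "array": repeating the same name is how a multiple select is normally submitted, which matches the Form Data shown in Chrome's dev tools. As a rough sketch, assuming Python 3 and assuming the receiving side parses the body with something like the standard library's parse_qs (the real site may use a different framework), the repeated name comes back as a list:

```python
from urllib.parse import parse_qs, parse_qsl

# The body produced by urlencoding the tuple pairs.
body = 'multipleSelectForm=value1&multipleSelectForm=value2'

# parse_qs groups repeated names into a list of values.
as_lists = parse_qs(body)

# parse_qsl keeps the original order as (name, value) pairs.
as_pairs = parse_qsl(body)

print(as_lists)  # {'multipleSelectForm': ['value1', 'value2']}
print(as_pairs)  # [('multipleSelectForm', 'value1'), ('multipleSelectForm', 'value2')]
```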

Upvotes: 1
