Reputation: 23
I'm looking to make a POST request to this page:
http://web1.ncaa.org/stats/StatsSrv/careersearch
The form on the right has four dropdowns. When I run the code below, the "School" dropdown stubbornly doesn't get selected. There's a hidden input that may be causing the problem, but I haven't been able to work around it. The JavaScript on the page doesn't seem to have an effect, but I could be wrong. Any help is appreciated:
#!/usr/bin/python
import urllib
import urllib2
url = 'http://web1.ncaa.org/stats/StatsSrv/careersearch'
values = {'searchOrg': '30123', 'academicYear': '2011', 'searchSport': 'MBA', 'searchDiv': '1'}
data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
the_page = response.read()
print the_page
Upvotes: 2
Views: 628
Reputation: 8312
I used mechanize:
import mechanize
from BeautifulSoup import BeautifulSoup

mech = mechanize.Browser()
mech.set_handle_robots(False)  # ignore robots.txt (see note below)
response = mech.open('http://web1.ncaa.org/stats/StatsSrv/careersearch')
# the career-search form is the third form on the page (zero-indexed)
mech.select_form(nr=2)
# select controls take their values as lists
mech.form['searchOrg'] = ['30123']
mech.form['academicYear'] = ['2011']
mech.form['searchSport'] = ['MBA']
mech.form['searchDiv'] = ['1']
mech.submit()
soup = BeautifulSoup(mech.response().read())
Note that mechanize expects the values for searchOrg, academicYear, searchSport, and searchDiv as lists, since they are select controls. You should also be mindful of robots.txt; set_handle_robots(False) tells mechanize to ignore it.
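If you're not sure which index to pass to select_form, you can list the forms on the page and their controls first. A minimal sketch (assuming the page layout hasn't changed since this answer):

import mechanize

mech = mechanize.Browser()
mech.set_handle_robots(False)
mech.open('http://web1.ncaa.org/stats/StatsSrv/careersearch')
# print each form's index, name, and control names to find the right one
for i, form in enumerate(mech.forms()):
    print i, form.name, [control.name for control in form.controls]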
Upvotes: 0
Reputation: 32620
As you suspected, you're missing a hidden field, doWhat = 'teamSearch', which is needed to submit the form on the right.
Using these request values works for me:
values = {'doWhat': 'teamSearch', 'searchOrg': '30123', 'academicYear': '2011', 'searchSport': 'MBA', 'searchDiv': '1'}
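For reference, here is a sketch of the question's script with that field added (assuming Python 2 and urllib/urllib2, as in the question):

import urllib
import urllib2

url = 'http://web1.ncaa.org/stats/StatsSrv/careersearch'
# 'doWhat' mirrors the hidden input submitted by the form on the right
values = {'doWhat': 'teamSearch', 'searchOrg': '30123', 'academicYear': '2011',
          'searchSport': 'MBA', 'searchDiv': '1'}
data = urllib.urlencode(values)
response = urllib2.urlopen(urllib2.Request(url, data))
print response.read()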
Upvotes: 2