Reputation: 9385
I have been trying to write a script that fetches results from my university website. Someone suggested that I use Mechanize and it does look really promising.
In order to get the result, one has to first enter the roll number and then select the session.
Simulating the first part has been easy with Mechanize, but with the second part I'm having problems as it is actually a JavaScript onchange
event.
I read the function definition in the JavaScript and this is what I have come up with so far. Mechanize can't handle the onchange event and also when I pass the values that are actually changed by the JavaScript function manually, the same page is returned.
Here's the javaScript Code
function __doPostBack(eventTarget, eventArgument) {
var theform;
if (window.navigator.appName.toLowerCase().indexOf("microsoft") > -1) {
theform = document.Form1;
}
else {
theform = document.forms["Form1"];
}
theform.__EVENTTARGET.value = eventTarget.split("$").join(":");
theform.__EVENTARGUMENT.value = eventArgument;
theform.submit();
}
I set a breakpoint in firebug and found the value of __EVENTTARGET
to be 'Dt1', whereas __EVENTARGUMENT
stays ''.
The ruby script that I have written to do this is
require 'mechanize'
#set up the agent to mimic firefox on windows
agent = Mechanize.new
agent.keep_alive = true
agent.user_agent = 'Windows Mozilla'
page = agent.get('http://www.nitt.edu/prm/nitreg/ShowRes.aspx')
#using mechanize to get us past the first form presented
result_form = page.form('Form1')
result_form.TextBox1 = '205110018'
page = agent.submit( result_form, result_form.buttons.first )
#the second hurdle that we encounter,
#here i'm trying to get past the JavaScript by doing what it does manually
result_form = page.form('Form1')
result_form.field_with('Dt1').options.find { |opt| opt.value == '66' }.select
result_form.field_with( :name => '__EVENTTARGET' ).value = 'Dt1'
#here i should have got the page with the results
page = agent.submit(result_form)
pp page
Can anyone tell me what I'm doing wrong?
Upvotes: 1
Views: 812
Reputation: 491
It looks like you have it working already! Try using puts page.body
instead of pp page
and you'll see the contents of the page. You can use Mechanize search functions to scrape the data from the page.
Also, you could simplify that code to:
result_form['__EVENTTARGET'] = 'Dt1'
result_form['Dt1'] = '66'
Upvotes: 1