Reputation: 971
I found out how to retrieve the HTML page for a topic from Google search by following a tutorial. This was the code given in the tutorial:
import mechanize
br = mechanize.Browser()
br.open('http://www.google.co.in')
br.select_form(nr = 0)
I understood it up to this point: it opens the page and selects the first form. Then the tutorial continued with:
br.form['q'] = 'search topic'
br.submit()
br.response().read()
This does output the HTML of the results page for the search topic. But my question is: what should the parameter in br.form[parameter] be? I tried the same code with Google News and it also worked. Can someone help me out?
Upvotes: 2
Views: 15746
Reputation: 1931
Look at the page source of http://www.google.co.in; it contains this code:
<input class="lst lst-tbb" value="" title="Google 搜索" size="41" type="text"
autocomplete="off" id="lst-ib" name="q" maxlength="2048"/>
The name="q" attribute indicates the parameter to use in br.form[parameter].
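To check this without reading the markup by eye, here is a minimal sketch that pulls the name attribute out of every input tag using only the standard library. The HTML string is an abbreviated copy of the tag above, and the class name InputNameCollector is made up for this illustration; it is not part of mechanize.

```python
from html.parser import HTMLParser

# Abbreviated copy of the <input> tag quoted above (illustrative sample only).
SAMPLE_HTML = (
    '<form action="/search">'
    '<input class="lst lst-tbb" value="" size="41" type="text" '
    'autocomplete="off" id="lst-ib" name="q" maxlength="2048"/>'
    '</form>'
)


class InputNameCollector(HTMLParser):
    """Hypothetical helper: record (name, type) for each named <input>."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input .../> also end up here, because
        # the default handle_startendtag delegates to handle_starttag.
        if tag == "input":
            attributes = dict(attrs)
            if "name" in attributes:
                self.fields.append(
                    (attributes["name"], attributes.get("type", "text"))
                )


collector = InputNameCollector()
collector.feed(SAMPLE_HTML)
field_names = [name for name, _ in collector.fields]
print(field_names)
```

Each name collected this way is a key that could be used in br.form[...]; note that the id attribute (lst-ib here) is not collected.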
Upvotes: 0
Reputation: 56634
It's the name of the form field (its name attribute, not its id), as given in the page source.
You can get the available field names like so:
import mechanize
br = mechanize.Browser()
br.open("http://www.google.com/")
for f in br.forms():
    # Printing a form lists every control as Type(name=value)
    print(f)
which gives me:
<f GET http://www.google.ca/search application/x-www-form-urlencoded
<HiddenControl(ie=ISO-8859-1) (readonly)>
<HiddenControl(hl=en) (readonly)>
<HiddenControl(source=hp) (readonly)>
<TextControl(q=)>
<SubmitControl(btnG=Google Search) (readonly)>
<SubmitControl(btnI=I'm Feeling Lucky) (readonly)>
<HiddenControl(gbv=1) (readonly)>>
which says that:
There is only one form on the page.
The hidden field names are ie (page encoding), hl (language code), source (value hp; don't know what it is for), and gbv (also don't know).
The only non-hidden, non-submit field is q, a text input that holds the search text.
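As a rough illustration of why the field name matters: the form above uses GET with application/x-www-form-urlencoded, so submitting it amounts to appending name=value pairs to the action URL. The sketch below rebuilds such a URL with the standard library from the control names and values in the listing. It is an approximation, not what mechanize does internally; in particular, the pair for the clicked submit button is left out for brevity.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Action URL and control names/values copied from the form listing above.
action = "http://www.google.ca/search"
controls = [
    ("ie", "ISO-8859-1"),
    ("hl", "en"),
    ("source", "hp"),
    ("q", "search topic"),  # the value assigned via br.form['q']
    ("gbv", "1"),
]

# urlencode joins the pairs with '&' and escapes values (space becomes '+').
url = action + "?" + urlencode(controls)
print(url)

# Decoding the query string again shows q carrying the search text.
query = parse_qs(urlsplit(url).query)
print(query["q"])
```

Whatever key is used in br.form[...] therefore has to match a control name the server expects, which is why q works for the Google search form.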
Upvotes: 7
Reputation: 76588
The parameter should be the name of the form element you are filling with the string. The easiest way to find the name is to inspect the web page with something like Firebug (that is for Firefox; use whatever is available for your browser). You can also look at the source of the page, but that is tedious when the page is complex.
E.g. the name of the form element for the box I am typing this in is "post-text".
Upvotes: 0