python - mechanize (setting input into form)

Manoj · Jun 16, 2012 · Viewed 14.8k times

I found out how to retrieve the HTML page for a topic from Google search by following a tutorial. The tutorial gave this code:

import mechanize
br = mechanize.Browser()
br.open('http://www.google.co.in')
br.select_form(nr = 0)

I understand that, up to this point, it selects the first form on the page. The tutorial then continued with:

br.form['q'] = 'search topic'
br.submit()
br.response().read()

This does output the HTML of the results page for the search topic. But my question is: what should the parameter in br.form[parameter] be? I ask because I tried the same code on Google News and it also gave a successful result. Can someone help me out?

Answer

Hugh Bothwell · Jun 16, 2012

It's the name of the form field (its name attribute, not its id), as given in the page source.
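To make "as given in the page source" concrete, here is a minimal sketch that uses only the Python 3 standard library rather than mechanize. The HTML snippet and the FieldNameCollector class are made up for illustration; the point is only that the key you pass is whatever appears in each input's name attribute.

```python
from html.parser import HTMLParser

# Hypothetical page source, loosely modelled on a simple search form.
SAMPLE_HTML = """
<form name="f" action="/search" method="get">
  <input type="hidden" name="hl" value="en">
  <input type="text" name="q" id="search-box" value="">
  <input type="submit" name="btnG" value="Google Search">
</form>
"""


class FieldNameCollector(HTMLParser):
    """Collect the name attribute of every <input> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            name = dict(attrs).get("name")
            if name is not None:
                self.names.append(name)


collector = FieldNameCollector()
collector.feed(SAMPLE_HTML)
field_names = collector.names
print(field_names)  # the text box is reachable as 'q', not as 'search-box'
```

Note that the text input in this made-up snippet has both a name (q) and an id (search-box); only the name is collected, since that is what gets sent when the form is submitted.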

You can list the available field names like so:

import mechanize

br = mechanize.Browser()
br.open("http://www.google.com/")

for f in br.forms():
    print(f)

which gives me:

<f GET http://www.google.ca/search application/x-www-form-urlencoded
  <HiddenControl(ie=ISO-8859-1) (readonly)>
  <HiddenControl(hl=en) (readonly)>
  <HiddenControl(source=hp) (readonly)>
  <TextControl(q=)>
  <SubmitControl(btnG=Google Search) (readonly)>
  <SubmitControl(btnI=I'm Feeling Lucky) (readonly)>
  <HiddenControl(gbv=1) (readonly)>>

which says that:

  1. There is only one form on the page

  2. The hidden fields are named ie (page encoding), hl (language code), source (its value is hp; I don't know what it is for), and gbv (also don't know).

  3. Apart from the two submit buttons, the only field that is not hidden is named q; it is a text input that holds the search text.
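To see why setting q is all you need, here is a rough sketch of what submitting a GET form with the application/x-www-form-urlencoded encoding amounts to: the field names and values are URL-encoded and appended to the form's action URL. This uses the Python 3 standard library, not mechanize itself. The field values are copied from the form printed above; which submit button gets included is my assumption (the first one), so treat the exact query string as illustrative.

```python
from urllib.parse import urlencode

# Action URL and field values taken from the form listing above.
action = "http://www.google.ca/search"
fields = [
    ("ie", "ISO-8859-1"),
    ("hl", "en"),
    ("source", "hp"),
    ("q", "search topic"),      # the one field you set yourself
    ("btnG", "Google Search"),  # assumption: the first submit button counts as clicked
    ("gbv", "1"),
]

# For a GET form, the encoded name=value pairs become the query string.
url = action + "?" + urlencode(fields)
print(url)
```

The hidden fields already carry their values, so the only thing left for you to fill in is q.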