So, I started out with Mechanize, and apparently the first thing I try it on is a monkey-rhino-level high JavaScript navigated site.
Now the thing I'm stuck on is submitting the form.
Normally I'd do a submit using the Mechanize built-in submit() function.
import mechanize
browser = mechanize.Browser()
browser.select_form(name = 'foo')
browser.form['bar'] = 'baz'
browser.submit()
This way it'd use the submit button that's available in the HTML form.
However, the site I'm stuck on had to be one that doesn't use HTML submit buttons... No, they're trying to be JavaScript gurus, and do a submit via JavaScript.
The usual submit() doesn't seem to work with this.
So... Is there a way to get around this?
Any help is appreciated. Many thanks!
--[Edit]--
The JavaScript function I'm stuck on:
function foo(bar, baz) {
var qux = document.forms["qux"];
qux.bar.value = bar.split("$").join(":");
qux.baz.value = baz;
qux.submit();
}
What I did in Python (and what doesn't work):
def foo(browser, bar, baz):
qux = browser.select_form("qux")
browser.form[bar] = ":".join(bar.split("$"))
browser.form[baz] = baz
browser.submit()
Three ways:
The first method is preferable if the form is submitted using the POST/GET method, otherwise you'll have to resort to second and third method.
Submitting the form manually and check for POST/GET requests, their parameters and the post url required to submit the form. Popular tools for checking headers are the Live HTTP headers extension and Firebug extension for Firefox, and Developer Tools extension for Chrome. An example of using the POST/GET method:
import mechanize
import urllib
browser = mechanize.Browser()
#These are the parameters you've got from checking with the aforementioned tools
parameters = {'parameter1' : 'your content',
'parameter2' : 'a constant value',
'parameter3' : 'unique characters you might need to extract from the page'
}
#Encode the parameters
data = urllib.urlencode(parameters)
#Submit the form (POST request). You get the post_url and the request type(POST/GET) the same way with the parameters.
browser.open(post_url,data)
#Submit the form (GET request)
browser.open(post_url + '%s' % data)
Rewrite the javascript and execute it in Python. Check out spidermonkey.
Emulate a full browser. Check out Selenium and Windmill.