How to fill in an online form and get results back in R

Joanne Demmler · Jan 9, 2013 · Viewed 8.5k times

Has anyone ever filled in a web form remotely from R?

I'd like to do some archery statistics in R using my scores. There is a very handy webpage, http://www.archersmate.co.uk/, that gives you the classification and handicap for a score, and I would naturally like to include these in my stats sheet.

Is it possible to fill in this form remotely and get the results back into R?

Otherwise I would have to get all handicap tables and stick them into a database myself.

UPDATE: We've narrowed the problem down to the fact that the form's submit button is written in JavaScript.

Answer

alex23lemm · Dec 24, 2014

You can use the RSelenium package to fill out and submit web forms and to retrieve the results.

The following code uses RSelenium to retrieve the classification and handicap for an example input (Male, Under 18, Longbow, Bristol V, 500):

library(RSelenium)

# Start Selenium Server --------------------------------------------------------

checkForServer()
startServer()
remDrv <- remoteDriver()
remDrv$open()


# Simulate browser session and fill out form -----------------------------------

remDrv$navigate('http://www.archersmate.co.uk/')
remDrv$findElement(using = "xpath", "//input[@value = 'Male']")$clickElement()
Sys.sleep(2) 
remDrv$findElement(using = "xpath", "//select[@id = 'drpAge']/option[@value = 'Under 18']")$clickElement()
remDrv$findElement(using = "xpath", "//input[@value ='Longbow']")$clickElement() 
remDrv$findElement(using = "xpath", "//select[@id = 'rnd']/option[@value = 'Bristol V']")$clickElement()
remDrv$findElement(using = "xpath", "//input[@id ='scr']")$sendKeysToElement(list('5', '0', '0'))
remDrv$findElement(using = "xpath", "//input[@id = 'cmdCalc']")$clickElement()

# Retrieve and download results injecting javascript ---------------------------

Sys.sleep(2)
clsf <- remDrv$executeScript(script = 'return $("#txtClass").val();', args = list())[[1]]
hndcp <- remDrv$executeScript(script = 'return $("#txtHandicap").val();', args = list())[[1]]

remDrv$quit()
remDrv$closeServer()
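
The fixed Sys.sleep(2) pauses in the code above assume the page always responds within two seconds. A minimal sketch of an alternative is to poll until a value shows up or a timeout is reached. Note that wait_for_value is a hypothetical helper written for this illustration, not part of RSelenium, and the timeout and interval defaults are arbitrary choices:

```r
# Hypothetical helper (not part of RSelenium): call `check` repeatedly until it
# returns a single non-empty value, or stop with an error after `timeout` seconds.
wait_for_value <- function(check, timeout = 10, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    # Treat errors (e.g. element not yet present) as "no value yet"
    value <- tryCatch(check(), error = function(e) NULL)
    if (length(value) == 1 && !is.na(value) && nzchar(as.character(value))) {
      return(value)
    }
    if (Sys.time() >= deadline) {
      stop("Timed out waiting for a value after ", timeout, " seconds")
    }
    Sys.sleep(interval)
  }
}

# Intended use with the session above (requires a running browser, not run here):
# clsf <- wait_for_value(function() {
#   remDrv$executeScript('return $("#txtClass").val();', args = list())[[1]]
# })
```

With this in place the second Sys.sleep(2) could be dropped, since the helper returns as soon as the result field is filled in.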

The default browser for RSelenium is Firefox. However, RSelenium also supports headless browsing using PhantomJS. To use PhantomJS you just need to

  • download PhantomJS and place it in the user's PATH
  • replace the code snippets at the beginning and at the end as described below

Default browsing (as shown above):

checkForServer()
startServer()
remDrv <- remoteDriver()

...

remDrv$quit()
remDrv$closeServer()

Headless browsing:

pJS <- phantom()
remDrv <- remoteDriver(browserName = 'phantomjs')

...

remDrv$close()
pJS$stop()
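
In more recent RSelenium releases, checkForServer(), startServer() and phantom() have been deprecated in favour of rsDriver(), which starts the server and opens a browser session in one call. The following is only a sketch under that assumption: start_session and stop_session are hypothetical wrapper names, the port is an arbitrary choice, and a local Firefox installation is assumed.

```r
# Sketch only: assumes an RSelenium version that provides rsDriver() and a
# local Firefox installation. The wrapper names and the port are illustrative.
start_session <- function(port = 4567L) {
  drv <- RSelenium::rsDriver(browser = "firefox", port = port, verbose = FALSE)
  # rsDriver() returns a list holding an already opened client and the server
  list(client = drv$client, server = drv$server)
}

# Close the browser session first, then shut the server down
stop_session <- function(session) {
  session$client$close()
  session$server$stop()
  invisible(NULL)
}

# Intended use (requires a browser, not run here):
# session <- start_session()
# remDrv  <- session$client
# ... same navigate / findElement / executeScript calls as in the answer ...
# stop_session(session)
```

The form-filling code itself should not need to change, since the client returned by rsDriver() is a remoteDriver object like remDrv above.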