Why 'Error: length(url) == 1 is not TRUE' with rvest web scraping

Hugo S. · Mar 17, 2015 · Viewed 7.7k times

I'm trying to scrape web data, but the first step requires a login. I've successfully logged into other websites this way, but I get a weird error with this one.

library("rvest")
library("magrittr")    

research <- html_session("https://www.fitchratings.com/")

signin <- research %>%
  html_nodes("form") %>%
  extract2(1) %>%
  html_form() %>%
  set_values(
    'userName' = "abc",
    'password' = "1234"
  )

research <- research %>%
  submit_form(signin)

When I run the 'submit_form' line I get the following error:

> research <- research %>%
+ submit_form(signin)
Submitting with '<unnamed>'
Error: length(url) == 1 is not TRUE

Submitting with '<unnamed>' is correct because there is no name assigned to the sign-in button. Any help appreciated!

Answer

robeot · Sep 15, 2015

I was having the same issue. I jumped through a few hoops to get the dev version of rvest running, and it's working smoothly now. Here's how I went about it:

First things first: you need to install Rtools. Make sure R is closed before you install it. Rtools can be downloaded here: https://cran.r-project.org/bin/windows/Rtools/. Installation instructions (if you're using Windows) can be found here: github.com/stan-dev/rstan/wiki/Install-Rtools-for-Windows

Boot up R, then install libraries "httr" and "Rcpp" if you don't have them already.
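A minimal sketch of that step, assuming you only want to install whatever is actually missing. The helper name `missing_pkgs` is made up for illustration and is not part of any package; the install call is guarded so it only runs in an interactive session.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install "httr" and "Rcpp" only if they are not already available.
if (interactive()) {
  to_install <- missing_pkgs(c("httr", "Rcpp"))
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
}
```

Running plain `install.packages(c("httr", "Rcpp"))` works just as well; the check only avoids reinstalling packages you already have.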

Install "devtools" and its GitHub installer. Full instructions are in the devtools repo, but here's a quick summary.

Windows:

install.packages("devtools")
library(devtools)
build_github_devtools()

#### Restart R before continuing ####
install.packages("devtools.zip", repos = NULL, type = "source")

# Remove the package after installation
unlink("devtools.zip")

Mac/Linux:

devtools::install_github("hadley/devtools")

Now run the final steps:

library(httr)
library(Rcpp)
library(devtools)
install_github("hadley/rvest")

You should now be able to run submit_form(session, form) without hitting this error:

Submitting with 'xxxx'
Error: length(url) == 1 is not TRUE
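If the error persists, it may be worth confirming which rvest build R is actually loading. The sketch below is only a sanity check, and `installed_version` is a hypothetical helper name. Development builds installed from GitHub often carry a four-part version number (for example one ending in .9000), but that is a convention, not a guarantee.

```r
# Hypothetical helper: installed version of `pkg` as a string, or NA if absent.
installed_version <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NA_character_)
  }
  as.character(utils::packageVersion(pkg))
}

# Compare this against the CRAN release to see whether the dev build took effect.
print(installed_version("rvest"))
```

Restart R after installing so that an older copy of rvest already loaded in the session does not mask the new one.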