I'm trying to scrape the content from http://google.com. the error message come out.
library(rvest)
html("http://google.com")
Error in open.connection(x, "rb") :
Timeout was reached In addition:
Warning message: 'html' is deprecated.
Use 'read_html' instead.
See help("Deprecated")
since I'm using company network ,this maybe caused by firewall or proxy. I try to use set_config ,but not working .
I encountered the same Error in open.connection(x, “rb”) : Timeout was reached
issue when working behind a proxy in the office network.
Here's what worked for me,
library(rvest)
url = "http://google.com"
download.file(url, destfile = "scrapedpage.html", quiet=TRUE)
content <- read_html("scrapedpage.html")
Credit : https://stackoverflow.com/a/38463559