httr: retrieving data with POST()

circld picture circld · Dec 9, 2014 · Viewed 7.1k times · Source

Disclaimer: while I have managed to grab data from another source using httr's POST function, let it be known that I am a complete n00b with regards to httr and HTML forms in general.

I would like to bring some data directly into R from a website using httr. My first attempt involved passing a named list to the body arg (as is shown in this vignette). However, I noticed square brackets in the form input names (at least I think they're the form input arguments). So instead, I tried passing in the body as a string as I think it should appear in the request body:

url <- 'http://research.stlouisfed.org/fred2/series/TOTALSA/downloaddata'
query <- paste('form[native_frequency]=Monthly', 'form[units]=lin',
                'form[frequency]=Monthly', 'form[obs_start_date]="1976-01-01"',
                'form[obs_end_date]="2014-11-01"', 'form[file_format]=txt'
                sep = '&')
response <- POST(url, body = query)

In any case, the above code just returns the webpage source code and I cannot figure out how to properly submit the form so that it returns the same data as manually clicking the form's 'Download Data' button.

In Developer Tools/Network on Chrome, it states in the Response Header under Content-Disposition that there is a text file attachment containing the data when I manually click the 'Download Data' button on the form. It doesn't appear to be in any of the headers associated with the response object in the code above. Why isn't this file getting returned by the POST request--where's the file with the data going?

Feels like I'm missing something obvious. Anyone care to help me connect the dots?

Answer

MrFlick picture MrFlick · Dec 9, 2014

Generally if you're going to use httr, you let it build and encode the data for you, you just pass in the information via a list of form values. Try

url<-"http://research.stlouisfed.org/fred2/series/TOTALSA/downloaddata"
query <- list('form[native_frequency]'="Monthly",
    'form[units]'="lin",
    'form[frequency]'="Monthly",
    'form[obs_start_date]'="1996-01-01",
    'form[obs_end_date]'="2014-11-01",
    'form[file_format]'="txt")
response <- POST(url, body = query)
content(response, "text")

and the return looks something like

[1] "Title:               Total Vehicle Sales\r\nSeries ID:           TOTALSA\r\nSource:   
US. Bureau of Economic Analysis\r\nRelease:             Supplemental Estimates, Motor 
Vehicles\r\nSeasonal Adjustment: Seasonally Adjusted Annual Rate\r\nFrequency:           Monthly\r\nUnits:               
Millions of Units\r\nDate Range:          1996-01-01 to 2014-11-
01\r\nLast Updated:        2014-12-05 7:16 AM CST\r\nNotes:               \r\n\r\nDATE       
VALUE\r\n1996-01-01  14.8\r\n1996-02-01  15.6\r\n1996-03-01  16.0\r\n1996-04-01  15.5\r\n1996-05-01 
16.0\r\n1996-06-01  15.3\r\n1996-07-01  15.1\r\n1996-08-01  15.5\r\n1996-09-01  15.5\r\n1996-10-01   15.3\r