How to integrate R in a web application

Benoit Guigal · Oct 22, 2012 · Viewed 12.7k times

I am developing a web application and I would like to perform two kinds of statistical/modeling operations.

(1) Batch analysis of data stored in the backend of my app (an HBase cluster). Typically, this operation needs to run on a regular basis, say every night. The data may be too large to fit in local memory, so this might require a package that supports parallel computing.

(2) On-the-fly R execution triggered by a user request in the front end. A typical use case is forecasting small time series. Users may submit requests at the same time, so there should be some support for concurrency. Performance is of paramount importance because the user can't wait indefinitely for the response.

My question is: what would be the best combination of technologies/CRAN packages to address these two problems? My idea for the moment is:

  • Using Rserve in combination with a Ruby client. Alternatively, I am thinking about writing the server myself in Java and using existing R/Java bindings.
  • Using RHadoop to handle jobs on big datasets.

I saw that RevoDeployR is a great tool, but it is not open source, is it?

Thank you for your help

Answer

user1600826 · Oct 22, 2012

I would recommend using RApache (http://rapache.net/) together with the R package RJSONIO or rjson.
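
As a rough sketch of how this could look for the on-the-fly forecasting case, the R script below builds a JSON response with rjson. The function name `forecast_json`, the query parameters `values` and `horizon`, and the choice of simple exponential smoothing via `HoltWinters` are assumptions made for illustration; none of them come from the question. The last part is meant to run only under RApache, which exposes the query string as the list `GET` and provides `setContentType()`.

```r
# Sketch only: forecast_json() and the "values"/"horizon" query
# parameters are hypothetical names chosen for this example.
library(rjson)

# Fit simple exponential smoothing (no trend, no seasonality) to a
# numeric vector and return the point forecasts as a JSON string.
forecast_json <- function(values, horizon = 3) {
  fit <- HoltWinters(ts(values), beta = FALSE, gamma = FALSE)
  pred <- as.numeric(predict(fit, n.ahead = horizon))
  toJSON(list(horizon = horizon, forecast = pred))
}

# Under RApache the script sees the parsed query string as the list `GET`.
# The guard lets the same file be sourced in a plain R session for testing.
if (exists("setContentType") && exists("GET") && is.list(GET)) {
  # Expecting a request such as ?values=10,12,11,13&horizon=3 (assumed format)
  values <- as.numeric(strsplit(GET$values, ",")[[1]])
  horizon <- if (is.null(GET$horizon)) 3L else as.integer(GET$horizon)
  setContentType("application/json")
  cat(forecast_json(values, horizon))
}
```

In a plain R session, `forecast_json(c(10, 12, 11, 13, 12, 14, 13, 15), horizon = 3)` returns a JSON string that a Ruby or JavaScript client can parse. How the script is mapped to a URL depends on the Apache configuration, which is not shown here.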