I am developing a web application and I would like to perform two kinds of statistical/modeling operations.
(1) Batch analysis of data stored in the backend of my app (an HBase cluster). Typically, this operation needs to be performed on a regular basis, say every night. The data may be too large to fit in local memory, so this might require a package that supports parallel computing.
(2) On-the-fly R execution triggered by a user request in the front end. Typical use cases include forecasting of small time series. Users may place requests at the same time, so there should be some support for concurrency. Performance is of paramount importance because the user can't wait indefinitely for the response to come.
My question is: what would be the best combination of technologies and CRAN packages to address these two problems? My idea for the moment is:
I saw that RevoDeployR looks like a great tool, but it is not open source, is it?
Thank you for your help
I would recommend using RApache (http://rapache.net/) together with the R package RJSONIO or rjson.
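To make that a bit more concrete, here is a minimal sketch of what the on-the-fly forecasting part could look like. The function name `forecast_json`, the request parameter `values`, and the choice of simple exponential smoothing are all assumptions made for illustration; swap in whatever model suits your series. Only `toJSON()` from rjson and `HoltWinters()`/`predict()` from base R's stats package are used.

```r
library(rjson)  # RJSONIO offers a toJSON() function under the same name

# Hypothetical helper: forecast the next `h` points of a small numeric series
# and return the result as a JSON string.
forecast_json <- function(values, h = 5) {
  # Simple exponential smoothing (no trend, no seasonality).
  # The model choice is an assumption, not a recommendation.
  fit <- HoltWinters(ts(values), beta = FALSE, gamma = FALSE)
  pred <- predict(fit, n.ahead = h)
  toJSON(list(forecast = as.numeric(pred)))
}

# Inside an RApache script the function could be wired to a request roughly
# like this (setContentType() and the POST list are supplied by RApache;
# the `values` parameter name is hypothetical):
#
#   setContentType("application/json")
#   series <- as.numeric(strsplit(POST$values, ",")[[1]])
#   cat(forecast_json(series, h = 5))
```

For example, `forecast_json(c(10, 12, 11, 13, 12, 14, 13, 15), h = 3)` returns a JSON object with a three-element `forecast` array that the front end can parse directly.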