Encrypting R script under MS-Windows

Vishal Belsare picture Vishal Belsare · Jan 16, 2011 · Viewed 14.2k times · Source

I have a bunch of R scripts which I am running on a Windows machine and want to ensure that the code remains unread by those not intended to see it. On a Linux box, I could wrap the R code in a bash script #! and make an encrypted (and perhaps even a limited-life) executable shell script. What are my options to do something on similar lines under Windows?

Answer

Iterator picture Iterator · Aug 4, 2011

My answer is a bit late, but I believe this is a good question. Unfortunately, I don't believe that there is a solution, or at least an easy one, at the present time.

The difficulty is common because, for most interpreted languages, including R, it is often possible to turn on logging and inspection of all commands being run. This can negate many tricks to obfuscate the code.

For those who prefer to think of code being open == good, one should know that a common reason to obfuscate the code is if one is consulting with a client that hires multiple vendors. It is not uncommon for a client to take scripts from vendor A and ask vendor B why it doesn't work with their system. (This may be done by a low-level IT flunkie, rather than someone responsible for the NDA contracts.) If A & B are competitors, A's code has just been handed to B. When scripts == serious programs, then serious code has been given away.

The ways I've seen this addressed are:

  1. Make a call to a compiled language, and use standard protections available there.
  2. Host the executable on a different server, and use calls to the server to execute the calculations. (In R, there are multiple server-side options.)
  3. Use compiled (preprocessed / bytecode) code within the language.

Option 2 is actually easier and better when the code may be widely distributed, not just for IP reasons. A major advantage is that it lets you upgrade the code without having to go through the pain of a site-wide release process. If new libraries are needed, no problem - update the server.

Option 3 is done in Matlab with .p files, and can be done with py2exe for Python on Windows. In R, the new bytecode compilation may be analogous, but I am not familiar enough with it to address any differences between .Rc files in the R context and .p files in the Matlab context. For more info on the compiler, see: http://www.inside-r.org/r-doc/compiler/compile

Hosting computations on the server is great for working with unsophisticated users, because it is easier to iterate quickly in response to bugs or feature requests. The IP protection is simply a benefit.