RStudio Server environment variables not loading?

AI52487963 · Jun 1, 2013 · Viewed 14.6k times

I'm trying to run RHadoop on Cloudera's Hadoop distribution (I can't remember if it's CDH3 or CDH4), and am running into an issue: RStudio Server doesn't seem to recognize my environment variables.

In my /etc/profile.d/r.sh file, I have:

export HADOOP_HOME=/usr/lib/hadoop
export HADOOP_CONF=/usr/hadoop/conf
export HADOOP_CMD=/usr/bin/hadoop
export HADOOP_STREAMING=/usr/lib/hadoop-mapreduce/

When I run R from the terminal, I get:

> Sys.getenv("HADOOP_CMD")
[1] "/usr/bin/hadoop"

But when I run Rstudio server:

> Sys.getenv("HADOOP_CMD")
[1] ""

And as a result, when I try to run rhdfs:

> library("rJava", lib.loc="/home/cloudera/R/x86_64-redhat-linux-gnu-library/2.15")
> library("rhdfs", lib.loc="/home/cloudera/R/x86_64-redhat-linux-gnu-library/2.15")
Error : .onLoad failed in loadNamespace() for 'rhdfs', details: 
    call: fun(libname, pkgname)
    error: Environment variable HADOOP_CMD must be set before loading package rhdfs
Error: package/namespace load failed for 'rhdfs'

Does anyone know where I should be putting my environment variables, if not in that specific r.sh file?

Thanks!

Answer

Charles Menguy · Jun 1, 2013

You should set your environment variables in .Renviron or Renviron.site. The site-wide file is typically located at R_HOME/etc/Renviron.site, while .Renviron is a per-user file in your home directory (or the session's working directory). You can get more information by typing:

> ?Startup
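For example, the variables from the question's r.sh could be added to R_HOME/etc/Renviron.site along these lines (a sketch using the paths from the question; note that Renviron files take plain name=value pairs, not shell `export` statements):

```
# Renviron.site entries -- plain name=value, no `export`
HADOOP_HOME=/usr/lib/hadoop
HADOOP_CONF=/usr/hadoop/conf
HADOOP_CMD=/usr/bin/hadoop
HADOOP_STREAMING=/usr/lib/hadoop-mapreduce/
```

After editing the file, restart RStudio Server (e.g. `sudo rstudio-server restart`) so new R sessions pick up the values.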

Someone had a similar issue here, and this is what they did to solve it.
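As a session-level workaround (assuming the same paths as in the question), the variables can also be set from within R itself before loading the package:

```r
# Set the variables for this R session only, then load rhdfs.
# This does not fix the profile issue -- it must be rerun each session.
Sys.setenv(HADOOP_CMD = "/usr/bin/hadoop",
           HADOOP_STREAMING = "/usr/lib/hadoop-mapreduce/")
library(rhdfs)
hdfs.init()
```

Putting the `Sys.setenv()` call in your .Rprofile would make it run automatically at startup, but the Renviron.site approach above is the cleaner, documented place for environment variables.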