Reputation: 1273
I'm trying to run RHadoop on Cloudera's Hadoop distribution (I can't remember if it's CDH3 or CDH4), and I'm running into an issue: RStudio Server doesn't seem to recognize my environment variables.
In my /etc/profile.d/r.sh file, I have:
export HADOOP_HOME=/usr/lib/hadoop
export HADOOP_CONF=/usr/hadoop/conf
export HADOOP_CMD=/usr/bin/hadoop
export HADOOP_STREAMING=/usr/lib/hadoop-mapreduce/
When I run R from the terminal, I get:
> Sys.getenv("HADOOP_CMD")
[1] "/usr/bin/hadoop"
But when I run the same thing in RStudio Server:
> Sys.getenv("HADOOP_CMD")
[1] ""
As a result, when I try to load rhdfs:
> library("rJava", lib.loc="/home/cloudera/R/x86_64-redhat-linux-gnu-library/2.15")
> library("rhdfs", lib.loc="/home/cloudera/R/x86_64-redhat-linux-gnu-library/2.15")
Error : .onLoad failed in loadNamespace() for 'rhdfs', details:
call: fun(libname, pkgname)
error: Environment variable HADOOP_CMD must be set before loading package rhdfs
Error: package/namespace load failed for 'rhdfs'
Does anyone know where I should put my environment variables, if not in that r.sh file?
Thanks!
Upvotes: 20
Views: 19648
Reputation: 56
Note that on Windows, R looks for the .Renviron file in /Users/<name>/Documents, while RStudio appears to expect the .Renviron file to be in /Users/<name>/.
Upvotes: 3
Reputation: 19
You can set the environment variable from inside RStudio with Sys.setenv, which takes name = value pairs, for example:
Sys.setenv(HADOOP_CMD = "/usr/bin/hadoop")
and then try loading rhdfs again.
Upvotes: -1
Reputation: 41458
You should set your environment variables in .Renviron or Renviron.site. I think the site-wide file lives at R_HOME/etc/Renviron.site, while a per-user .Renviron goes in your home directory. You can get more information by typing:
> ?Startup
Someone had a similar issue here and this is what he did to solve it.
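As a rough sketch of the Renviron.site approach, the variables from the question's r.sh could be appended to an Renviron-style file like this. The function name append_hadoop_env is made up for illustration, and the values are simply copied from the question, so adjust them to your install. Renviron files take plain NAME=value lines without the export keyword.

```shell
# Hypothetical helper: append the Hadoop variables from the question's r.sh
# to the Renviron-style file given as the first argument.
append_hadoop_env() {
  # Renviron files use NAME=value lines; there is no `export` keyword.
  cat >> "$1" <<'EOF'
HADOOP_HOME=/usr/lib/hadoop
HADOOP_CONF=/usr/hadoop/conf
HADOOP_CMD=/usr/bin/hadoop
HADOOP_STREAMING=/usr/lib/hadoop-mapreduce/
EOF
}
```

Assuming R is on the PATH, calling append_hadoop_env "$(R RHOME)/etc/Renviron.site" as a user with write access to that file should cover all users, and append_hadoop_env "$HOME/.Renviron" should cover a single user. The file is read at R startup, so only sessions started afterwards will see the values; Sys.getenv("HADOOP_CMD") in a new RStudio Server session is a quick way to check.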
Upvotes: 15