Reputation: 25
I have an Ubuntu desktop with RStudio installed, and a remote Hadoop cluster running on CentOS that I would like to connect to from RStudio. As I understand it, this is a viable setup. Can someone please confirm?
Upvotes: 0
Views: 5255
Reputation: 7255
RStudio itself will not let you connect to Hadoop directly, but you can use the Hadoop Streaming API to submit Hadoop jobs from R.
There are a few packages to help you get started. I have used rmr to run map/reduce jobs on a Hadoop cluster through the Streaming API. The packages can be found here:
https://github.com/RevolutionAnalytics/RHadoop/wiki
There is also the RHIPE package, which lets you communicate with the HDFS file system from inside your R scripts:
http://www.datadr.org/doc/functions.html
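For anyone unsure what the Streaming API mentioned above actually expects: a streaming job runs any executable that reads lines from stdin and writes tab-separated key/value lines to stdout, and Hadoop sorts the mapper output by key before handing it to the reducer. Here is a rough local sketch of that contract as a word count. It is written in Python only to keep it short, it is not how rmr or RHIPE are called, and the function names (`mapper`, `reducer`, `run_local`) are made up for illustration.

```python
from itertools import groupby


def record_key(record):
    """Return the key part of a 'key<TAB>value' record."""
    return record.split("\t", 1)[0]


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Sum the counts for each key; records must already be sorted by key."""
    for key, group in groupby(sorted_records, key=record_key):
        total = sum(int(rec.split("\t", 1)[1]) for rec in group)
        yield f"{key}\t{total}"


def run_local(lines):
    """Simulate map -> sort by key -> reduce on one machine (hypothetical helper)."""
    shuffled = sorted(mapper(lines), key=record_key)
    return list(reducer(shuffled))


# Example: run_local(["a b a", "B c"]) returns ['a\t2', 'b\t2', 'c\t1']
```

On a real cluster the mapper and reducer would be separate scripts passed to the streaming jar, and the sort step is done by Hadoop itself. As I understand it, packages like rmr take care of that plumbing so you only write the map and reduce functions in R.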
Upvotes: 1