Kevin Wincott

Reputation: 25

RStudio to connect to remote Hadoop server

I have an Ubuntu desktop with RStudio installed, and a remote Hadoop cluster running on CentOS that I would like to connect to from RStudio. My understanding is that this is a viable approach; can someone please confirm?

Upvotes: 0

Views: 5255

Answers (1)

Chris Hinshaw

Reputation: 7255

RStudio itself will not connect to Hadoop, but you can use the Hadoop Streaming API to submit Hadoop jobs written in R.
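As a rough illustration of what the streaming API expects: Hadoop Streaming feeds input lines to a script on stdin and reads tab-separated key/value lines back from stdout. The sketch below is a word count in base R that follows that contract. The function names (`tokenize`, `map_lines`, `reduce_records`) and the sample input are made up for illustration and are not part of any package; the last lines do a local dry run without a cluster.

```r
# Word-count sketch following the Hadoop Streaming contract, base R only.
# All function names here are hypothetical helpers, not a package API.

# Split one line of text into lower-case words.
tokenize <- function(line) {
  words <- strsplit(tolower(line), "[^a-z0-9]+")[[1]]
  words[nzchar(words)]
}

# Mapper step: turn input lines into "word<TAB>1" records.
map_lines <- function(lines) {
  words <- unlist(lapply(lines, tokenize))
  if (length(words) == 0) return(character(0))
  paste(words, 1L, sep = "\t")
}

# Reducer step: sum the counts for each key in "word<TAB>count" records.
reduce_records <- function(records) {
  if (length(records) == 0) return(character(0))
  parts <- strsplit(records, "\t", fixed = TRUE)
  keys <- vapply(parts, `[`, character(1), 1)
  counts <- as.integer(vapply(parts, `[`, character(1), 2))
  totals <- tapply(counts, keys, sum)
  paste(names(totals), totals, sep = "\t")
}

# Local dry run: map, sort by key (standing in for Hadoop's shuffle), reduce.
mapped <- map_lines(c("Hadoop and R", "R and RStudio"))
result <- reduce_records(sort(mapped))
print(result)
```

In a real streaming job the mapper and reducer would normally be two separate Rscript files, each ending with something like `writeLines(map_lines(readLines(file("stdin"))))`, and the cluster nodes would need R installed. The packages mentioned in this answer wrap this plumbing for you.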

There are a few packages to help you get started. I have used rmr to run map/reduce jobs on a Hadoop cluster through the streaming API. The packages can be found here:

https://github.com/RevolutionAnalytics/RHadoop/wiki

There is also the RHIPE package, which lets you communicate with HDFS from inside your R scripts.

http://www.datadr.org/doc/functions.html

Upvotes: 1
