Reputation: 5996
I'm working on adding HiveServer2 support to my company's R data-access package. I'm curious what the best way of generating an R Thrift client would be. I'm considering writing an R wrapper around the Java Thrift client, similar to what rhbase does, but I'd prefer a pure R solution, if possible.
Things to note:
beeline
client in some R goodness.
Upvotes: 19
Views: 1123
Reputation: 21563
The exact scope of this question may be too broad for Stack Overflow, and the asker confirmed that he abandoned this quest, but for future readers this is probably the thing to look for:
From R you can connect to Hive with JDBC.
This is not exactly what the asker came for, but it should serve the purpose in most cases.
The key part of the solution is the RJDBC package. Here is some example code found on the Cloudera Community:
library(DBI)
library(rJava)
library(RJDBC)
# Collect the jars the Hive JDBC driver needs (these paths are specific to one HDP install; adjust for your cluster)
hadoop.class.path <- list.files(path = "/usr/hdp/2.4.0.0-169/hadoop", pattern = "jar", full.names = TRUE)
hive.class.path <- list.files(path = "/usr/hdp/current/hive-client/lib", pattern = "jar", full.names = TRUE)
mapred.class.path <- list.files(path = "/usr/hdp/current/hadoop-mapreduce-client/lib", pattern = "jar", full.names = TRUE)
cp <- c(hive.class.path, mapred.class.path, hadoop.class.path)
# Hand the collected jars to the driver as its class path
drv <- JDBC("org.apache.hive.jdbc.HiveDriver", classPath = cp, identifier.quote = "`")
# Replace the host, user, and password with your own
conn <- dbConnect(drv, "jdbc:hive2://ixxx:10000/default", "hive", "hive")
show_databases <- dbGetQuery(conn, "show databases")
dbDisconnect(conn)
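If you end up running more than one query, it may help to wrap the steps above in small functions. The following is only a sketch: `collect_jars`, `hive_jdbc_url`, and `hive_query` are hypothetical helper names, not part of RJDBC or DBI, and the default port and database are simply taken from the connection string above. It assumes the RJDBC and DBI packages are installed and that the jar directories you pass in contain the Hive JDBC driver and its dependencies.

```r
# Hypothetical helper: list every .jar file under the given directories.
collect_jars <- function(dirs) {
  list.files(path = dirs, pattern = "\\.jar$", full.names = TRUE)
}

# Hypothetical helper: assemble a HiveServer2 JDBC URL.
# The defaults mirror the connection string used in the example above.
hive_jdbc_url <- function(host, port = 10000L, database = "default") {
  sprintf("jdbc:hive2://%s:%d/%s", host, as.integer(port), database)
}

# Hypothetical helper: connect, run one query, and always disconnect,
# even if the query fails.
hive_query <- function(sql, host, user, password, jar_dirs) {
  drv <- RJDBC::JDBC("org.apache.hive.jdbc.HiveDriver",
                     classPath = collect_jars(jar_dirs),
                     identifier.quote = "`")
  conn <- DBI::dbConnect(drv, hive_jdbc_url(host), user, password)
  on.exit(DBI::dbDisconnect(conn), add = TRUE)
  DBI::dbGetQuery(conn, sql)
}
```

A call would then look like `hive_query("show databases", host, "hive", "hive", jar_dirs)`, with `host` and `jar_dirs` set for your own cluster. The `on.exit` call is a design choice so that the connection is closed even when the query raises an error.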
Full disclosure: I am an employee of Cloudera.
Upvotes: 1