Reputation: 71
I am trying to connect R to a Hive cluster using the RJDBC package.
The code I have written is:
drv <- JDBC(driverClass = "org.apache.hive.jdbc.HiveDriver",
            classPath = list.files("C:/hive-jdbc/hive-jdbc-0.10.0.jar",
                                   pattern="jar$",full.names=T),
            identifier.quote="'")
I have added "C:/hive-jdbc" to my system path variable as well.
But I am getting the following error:
Error in path.expand(unlist(strsplit(classPath, .Platform$path.sep))) :
invalid 'path' argument
Can someone help me with this?
Upvotes: 1
Views: 1292
Reputation: 101
In answer to Prateek, regarding "Class not found": the class is not in that jar file, so you need more jar files in your class path. For me these were:
/usr/lib/hive/lib/hive-jdbc.jar
/usr/lib/hive/lib/libthrift-0.9.2.jar
/usr/lib/hive/lib/hive-service.jar
/usr/lib/hive/lib/httpclient-4.2.5.jar
/usr/lib/hive/lib/httpcore-4.2.5.jar
/usr/lib/hive/lib/hive-jdbc-standalone.jar
/usr/lib/hadoop/client/hadoop-common.jar
(Some of these paths are symbolic links to the real file; use the real file.) I also wrote a basic step-by-step article on getting this working: https://pygot.wordpress.com/2016/10/13/connecting-r-studio-to-hadoop-via-hive/
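A rough sketch of how that jar list could be passed to RJDBC, in case it helps. The helper name resolve_jars is made up for illustration, the paths are the ones from my machine and will differ on yours, and I am assuming JDBC() accepts a character vector for classPath (the error message in the question suggests it splits and flattens whatever it is given). normalizePath() is used to turn symbolic links into the real file paths.

```r
# Jar list copied from the answer above; paths will differ per installation.
jars <- c(
  "/usr/lib/hive/lib/hive-jdbc.jar",
  "/usr/lib/hive/lib/libthrift-0.9.2.jar",
  "/usr/lib/hive/lib/hive-service.jar",
  "/usr/lib/hive/lib/httpclient-4.2.5.jar",
  "/usr/lib/hive/lib/httpcore-4.2.5.jar",
  "/usr/lib/hive/lib/hive-jdbc-standalone.jar",
  "/usr/lib/hadoop/client/hadoop-common.jar"
)

# Hypothetical helper: report missing jars and resolve symbolic links
# to the real files for the jars that do exist.
resolve_jars <- function(paths) {
  found <- file.exists(paths)
  if (any(!found)) {
    message("Missing jar(s): ", paste(paths[!found], collapse = ", "))
  }
  normalizePath(paths[found])
}

class_path <- resolve_jars(jars)

# Only try to load the driver when every jar was found and RJDBC is installed.
if (length(class_path) == length(jars) &&
    requireNamespace("RJDBC", quietly = TRUE)) {
  drv <- RJDBC::JDBC(driverClass = "org.apache.hive.jdbc.HiveDriver",
                     classPath = class_path)
}
```

If a jar is reported as missing, fix the path before calling JDBC(); a "Class not found" error afterwards usually means another jar still needs to be added to the list.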
Upvotes: 0
Reputation: 94182
In
classPath = list.files("C:/hive-jdbc/hive-jdbc-0.10.0.jar",
pattern="jar$",full.names=T)
you use list.files. The first argument to list.files should be a folder, but you seem to have given it a jar file. What is the output of just that list.files call on your system? It's probably character(0), and that breaks the classPath argument. Fix that first, although it's not clear what you want the value of the classPath parameter to be here. If you want it to be all the .jar files in a folder, then
list.files("C:/wherever/", pattern="\\.jar$", full.names=TRUE)
should do it (note the doubled backslash, which R needs inside a string). If it's just the one jar file, just put it in:
classPath="C:/hive-jdbc/hive-blahlah-999.jar"
in the call. I.e., keep it simple!
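Here is a small base-R sketch you can run to see the difference without touching Hive at all. The folder and file names are placeholders created in a temporary directory, and the last step simply repeats the expression quoted in the error message, so it only approximates what happens inside JDBC().

```r
# Build a throwaway folder containing one empty placeholder "jar".
jar_dir <- file.path(tempdir(), "hive-jdbc-demo")
dir.create(jar_dir, showWarnings = FALSE)
jar_file <- file.path(jar_dir, "hive-jdbc-demo.jar")
invisible(file.create(jar_file))

# Pointing list.files() at the jar itself: it expects a folder, so it finds nothing.
from_file <- list.files(jar_file, pattern = "jar$", full.names = TRUE)

# Pointing list.files() at the folder: the jar is found.
from_dir <- list.files(jar_dir, pattern = "\\.jar$", full.names = TRUE)

# Repeat the expression from the error message with the empty result
# and capture the error text instead of stopping.
err <- tryCatch(
  path.expand(unlist(strsplit(from_file, .Platform$path.sep))),
  error = function(e) conditionMessage(e)
)

print(from_file)
print(basename(from_dir))
print(err)
```

The first print shows the empty result from the question's call, the second shows the jar found via the folder, and the third shows the error raised when the empty result reaches path.expand().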
Upvotes: 1