Reputation: 145
I'm trying to use Spark locally on my machine, and I was able to reproduce the tutorial at:
However, when I try to use Hive I get the following error:
Error in value[3L] : Spark SQL is not built with Hive support
The code:
## Set environment variables
Sys.setenv(SPARK_HOME = 'F:/Spark_build')
# Set the library path
.libPaths(c(file.path(Sys.getenv('SPARK_HOME'), 'R', 'lib'), .libPaths()))
# Load SparkR
library(SparkR)
sc <- sparkR.init()
sqlContext <- sparkRHive.init(sc)
sparkR.stop()
At first I suspected the pre-built version of Spark, so I tried building my own with Maven, which took almost an hour:
mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -DskipTests clean package
However, the error persists.
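A quick way to check whether the resulting assembly actually contains Hive support is to inspect the jar from R. This is only a sketch: the lib/spark-assembly-*.jar name and location are assumptions based on the standard Spark 1.x distribution layout, so adjust them if your build differs.
# Hedged sketch: look for the HiveContext class inside the assembly jar.
jar <- list.files(file.path(Sys.getenv('SPARK_HOME'), 'lib'),
                  pattern = '^spark-assembly.*\\.jar$', full.names = TRUE)[1]
entries <- unzip(jar, list = TRUE)$Name   # a jar is a zip, so unzip() can list it
any(grepl('org/apache/spark/sql/hive/HiveContext', entries))  # TRUE if Hive is built in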
Upvotes: 3
Views: 585
Reputation: 1
We had the same problem, but we could not simply move to Linux. After a while we found this page on Spark on Windows and came up with the following solution:
Create a file named hive-site.xml (typically under Spark's conf directory) and put the following in it:
<configuration>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>C:\tmp\hive</value>
    <description>Scratch space for Hive jobs</description>
  </property>
</configuration>
Then grant write permissions on that scratch directory with winutils:
winutils.exe chmod -R 777 C:\tmp\hive
This solved the problem on our Windows machine, and we can now run SparkR with Hive support.
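For completeness, here is a minimal sketch of the R side of this setup. The C:\hadoop path is an assumption (it must contain bin\winutils.exe), and SPARK_HOME reuses the path from the question; adjust both to your layout.
# Hedged sketch: point Spark at winutils.exe before starting SparkR.
Sys.setenv(HADOOP_HOME = 'C:/hadoop')      # assumed location of bin/winutils.exe
Sys.setenv(SPARK_HOME = 'F:/Spark_build')  # the Hive-enabled build
.libPaths(c(file.path(Sys.getenv('SPARK_HOME'), 'R', 'lib'), .libPaths()))
library(SparkR)
sc <- sparkR.init()
hiveContext <- sparkRHive.init(sc)         # should now initialize without the error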
Upvotes: 0
Reputation: 60319
If you just followed the tutorial's instructions, you simply do not have Hive installed (try running hive from the command line)... I have found that this is a common point of confusion for Spark beginners: "pre-built for Hadoop" does not mean that it needs Hadoop, let alone that it includes Hadoop (it does not), and the same holds for Hive.
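If you want to verify from R which situation you are in, you can probe for Hive support at runtime. A minimal sketch using the same SparkR 1.x API as the question; sparkRHive.init() is what throws the error quoted above when the build lacks Hive support, and the SHOW TABLES query is just an arbitrary example.
library(SparkR)
sc <- sparkR.init()
# Hedged sketch: catch the "not built with Hive support" error and report it.
hiveContext <- tryCatch(
  sparkRHive.init(sc),
  error = function(e) {
    message('No Hive support in this Spark build: ', conditionMessage(e))
    NULL
  }
)
if (!is.null(hiveContext)) {
  showDF(sql(hiveContext, 'SHOW TABLES'))  # any Hive QL statement works here
}
sparkR.stop()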
Upvotes: 1