ostrokach

Reputation: 19942

Can't connect to Spark thriftserver using JDBC

I followed the Spark instructions for starting a Thrift JDBC server:

$ ./spark-2.1.1-bin-hadoop2.7/sbin/start-thriftserver.sh

I can connect to it ok from beeline:

$ ./spark-2.1.1-bin-hadoop2.7/bin/beeline -u 'jdbc:hive2://localhost:10000'
Connecting to jdbc:hive2://localhost:10000
log4j:WARN No appenders could be found for logger (org.apache.hive.jdbc.Utils).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Connected to: Spark SQL (version 2.1.1)
Driver: Hive JDBC (version 1.2.1.spark2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1.spark2 by Apache Hive
0: jdbc:hive2://localhost:10000>

However, trying to connect from DataGrip using JDBC and the same connection string, I get an error:

[2017-07-07 16:46:57] java.lang.ClassNotFoundException: org.apache.thrift.transport.TTransportException
[2017-07-07 16:46:57]   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
[2017-07-07 16:46:57]   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
[2017-07-07 16:46:57]   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
[2017-07-07 16:46:57]   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
[2017-07-07 16:46:57]   at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
[2017-07-07 16:46:57]   at com.intellij.database.remote.jdbc.impl.RemoteDriverImpl.connect(RemoteDriverImpl.java:27)
[2017-07-07 16:46:57]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2017-07-07 16:46:57]   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[2017-07-07 16:46:57]   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2017-07-07 16:46:57]   at java.lang.reflect.Method.invoke(Method.java:498)
[2017-07-07 16:46:57]   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:324)
[2017-07-07 16:46:57]   at sun.rmi.transport.Transport$1.run(Transport.java:200)
[2017-07-07 16:46:57]   at sun.rmi.transport.Transport$1.run(Transport.java:197)
[2017-07-07 16:46:57]   at java.security.AccessController.doPrivileged(Native Method)
[2017-07-07 16:46:57]   at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
[2017-07-07 16:46:57]   at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
[2017-07-07 16:46:57]   at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
[2017-07-07 16:46:57]   at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
[2017-07-07 16:46:57]   at java.security.AccessController.doPrivileged(Native Method)
[2017-07-07 16:46:57]   at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
[2017-07-07 16:46:57]   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[2017-07-07 16:46:57]   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[2017-07-07 16:46:57]   at java.lang.Thread.run(Thread.java:745) (no stack trace)

I configured DataGrip to use the JDBC library hive-jdbc-1.2.1.spark2.jar from the Spark installation folder.
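For what it's worth, the missing class in the stack trace comes from libthrift, which HiveDriver only touches at connect time. A quick way to check whether a driver classpath is complete is a plain-JDK sketch like the one below (the class names are taken from the stack trace; the helper itself is hypothetical):

```java
// Hypothetical classpath self-check: run it with the same jars that
// DataGrip's "JDBC drivers" entry puts on the classpath.
public class DriverClasspathCheck {

    // Returns true iff the class can be loaded from the current classpath.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "org.apache.hive.jdbc.HiveDriver",                 // hive-jdbc
            "org.apache.thrift.transport.TTransportException", // libthrift
            "org.apache.hadoop.conf.Configuration",            // hadoop-common
        };
        for (String cls : required) {
            System.out.println((isLoadable(cls) ? "OK      " : "MISSING ") + cls);
        }
    }
}
```

Any `MISSING` line points at a jar that beeline ships with but the DataGrip driver entry lacks.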

Upvotes: 2

Views: 2623

Answers (3)

MxR

Reputation: 606

Adding to tekumara's answer: you can simplify your life by using just two jars: hadoop-common-2.7.3.jar from the Spark distro, and hive-jdbc-1.2.1-standalone.jar.

Upvotes: 1

tekumara

Reputation: 8807

From the Spark 2.2.1 distribution, you'll need the following jars:

commons-logging-1.1.3.jar
hadoop-common-2.7.3.jar
hive-exec-1.2.1.spark2.jar
hive-jdbc-1.2.1.spark2.jar
hive-metastore-1.2.1.spark2.jar
httpclient-4.5.2.jar
httpcore-4.4.4.jar
libthrift-0.9.3.jar
slf4j-api-1.7.16.jar
spark-hive-thriftserver_2.11-2.2.1.jar
spark-network-common_2.11-2.2.1.jar

In DataGrip, select the driver class org.apache.hive.jdbc.HiveDriver and set Tx (transaction control) to Manual (Spark doesn't support autocommit).

You should now be able to connect using the URL jdbc:hive2://hostname:10000/
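Outside DataGrip, a minimal JDBC sketch against that URL might look like this (the localhost URL and the query are placeholders; it assumes the jars above are on the classpath, and it deliberately never calls setAutoCommit, matching the Manual Tx setting):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class SparkThriftQuery {

    // Attempts a connection and one query; returns a short status string.
    static String tryQuery(String url, String sql) {
        // Note: no conn.setAutoCommit(...) call anywhere -- Spark's
        // thrift server rejects autocommit changes, which is why Tx
        // must be set to Manual in DataGrip.
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            StringBuilder out = new StringBuilder("ok:");
            while (rs.next()) {
                out.append(' ').append(rs.getString(1));
            }
            return out.toString();
        } catch (SQLException e) {
            // "No suitable driver" here means hive-jdbc isn't on the
            // classpath at all.
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryQuery("jdbc:hive2://localhost:10000/", "SHOW DATABASES"));
    }
}
```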

Upvotes: 3

ostrokach

Reputation: 19942

After adding all *.jar files from the spark/jars folder to the "JDBC drivers" window in DataGrip, it works! I'm not sure which of those libraries are actually required, but trial and error tells me that many of them are!
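If anyone wants to narrow that set down, a small pure-JDK helper (hypothetical; it works on any folder of jars) can report which jar actually provides a given class, e.g. the org.apache.thrift.transport.TTransportException from the original error:

```java
import java.io.File;
import java.util.jar.JarFile;

public class JarFinder {

    // Returns the name of the first jar in jarDir containing the given
    // class, or null if none does.
    static String findJarFor(File jarDir, String className) throws Exception {
        String entry = className.replace('.', '/') + ".class";
        File[] jars = jarDir.listFiles((dir, name) -> name.endsWith(".jar"));
        if (jars == null) {
            return null;
        }
        for (File f : jars) {
            try (JarFile jar = new JarFile(f)) {
                if (jar.getEntry(entry) != null) {
                    return f.getName();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: JarFinder <jar-folder> <class-name>");
            return;
        }
        System.out.println(findJarFor(new File(args[0]), args[1]));
    }
}
```

Running it as `java JarFinder spark/jars org.apache.thrift.transport.TTransportException` should name the libthrift jar, and so on for each class a stack trace complains about.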

Upvotes: 1
