How is Spark Thrift Server related to Apache Thrift?

Question

I read a post on Quora which says that Spark Thrift Server is related to Apache Thrift, which is a binary communication protocol. Spark Thrift Server is the interface to Hive, but how does Spark Thrift Server use Apache Thrift to communicate with Hive via a binary protocol/RPC?

T. Gawęda · Accepted Answer

Spark Thrift Server is a Hive-compatible interface for Spark.

That means it provides an implementation of HiveServer2, so you can connect to it with beeline; however, almost all of the computation is performed by Spark, not Hive.
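As a rough illustration of the client side, the sketch below builds the beeline command one might run against a Spark Thrift Server. The helper name `beeline_command` is hypothetical, and the host, database, and port 10000 (the usual HiveServer2 default) are assumptions; your deployment may differ.

```python
import shlex


def beeline_command(host="localhost", port=10000, database="default"):
    """Return the argv list for pointing beeline at a HiveServer2-compatible endpoint.

    Hypothetical helper for illustration; host, port and database are assumed values.
    """
    # Spark Thrift Server speaks the HiveServer2 protocol, so the JDBC URL
    # uses the same hive2 scheme that a real HiveServer2 would.
    url = f"jdbc:hive2://{host}:{port}/{database}"
    return ["beeline", "-u", url]


if __name__ == "__main__":
    # Print a shell-ready command line.
    print(shlex.join(beeline_command()))
```

The point is that the client does not need to know whether Hive or Spark sits behind the endpoint: the same `jdbc:hive2://` URL works for both.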

In previous versions, the query parser came from Hive. Currently, Spark Thrift Server uses Spark's own query parser.

Apache Thrift is a framework for developing RPC (Remote Procedure Call) services, so there are many implementations that use Thrift. Cassandra also used Thrift; it has since been replaced by the Cassandra native protocol.
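To make "binary protocol/RPC" a bit more concrete, here is a minimal hand-rolled sketch of how the start of a call message looks in Thrift's strict binary protocol: a version-plus-type word, a length-prefixed method name, and a sequence id. This is not the Thrift library and not how Spark implements it; real clients use code generated from the HiveServer2 interface definition, and the transport may add framing or SASL on top. The function names are hypothetical, and `OpenSession` is used only as an example method name from the HiveServer2 service.

```python
import struct

VERSION_1 = 0x80010000  # strict-mode version marker of Thrift's binary protocol
CALL = 1                # message type for a client request


def encode_message_begin(method: str, seqid: int) -> bytes:
    """Encode the message header that precedes the call arguments."""
    name = method.encode("utf-8")
    header = struct.pack(">I", VERSION_1 | CALL)    # version + message type, big-endian
    header += struct.pack(">i", len(name)) + name   # length-prefixed method name
    header += struct.pack(">i", seqid)              # sequence id chosen by the client
    return header


def decode_message_begin(data: bytes):
    """Decode a header produced by encode_message_begin."""
    (word,) = struct.unpack_from(">I", data, 0)
    if word & 0xFFFF0000 != VERSION_1:
        raise ValueError("not a strict binary-protocol message")
    msg_type = word & 0xFF
    (length,) = struct.unpack_from(">i", data, 4)
    name = data[8:8 + length].decode("utf-8")
    (seqid,) = struct.unpack_from(">i", data, 8 + length)
    return name, msg_type, seqid


if __name__ == "__main__":
    frame = encode_message_begin("OpenSession", 0)
    print(frame.hex())
    print(decode_message_begin(frame))
```

The server reads the method name from this header to decide which procedure to run, which is the "remote procedure call" part; the arguments and the reply are encoded as Thrift structs after it.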

So, Apache Thrift is a framework for developing RPCs, and Spark Thrift Server is an implementation of the Hive protocol that uses Spark as its computation framework.

For more details, please see this link from @RussS
