liyong

Reputation: 407

What's the relationship between Spark RDD and Spark SQL?

I am a Spark beginner, and I'm confused about the relationship between Spark RDD and Spark SQL. Is Spark SQL converted to Spark RDD operations in the background?

Upvotes: 1

Views: 708

Answers (1)

Kristian

Reputation: 21810

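As far as I know, they run on the same execution engine but reach it by different paths: a Spark SQL query is planned and optimized first, and the resulting physical plan is then executed as RDD operations. So yes, Spark SQL does end up as RDDs in the background.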

Spark SQL uses an internal optimizer called Catalyst, which builds a logical plan for the query, optimizes it, and turns it into a physical plan, including code generation for performance.
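To make the idea of "optimizing a logical plan" concrete, here is a toy sketch in plain Python. It is not Spark's code: the `Scan` and `Filter` classes and the `combine_filters` rule are hypothetical names made up for illustration. The point is only that when a query is held as data (a tree), a rule can inspect and rewrite it before anything runs.

```python
from dataclasses import dataclass


# Hypothetical plan nodes for illustration only; these are not Spark classes.
@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    condition: str  # the predicate is kept as data, e.g. "age >= 18"
    child: object


def combine_filters(plan):
    """Rewrite Filter(a, Filter(b, x)) into a single Filter("(b) AND (a)", x)."""
    if isinstance(plan, Filter):
        child = combine_filters(plan.child)  # optimize bottom-up
        if isinstance(child, Filter):
            merged = f"({child.condition}) AND ({plan.condition})"
            return Filter(merged, child.child)
        return Filter(plan.condition, child)
    return plan  # Scan nodes are left unchanged


# Two stacked filters over a table scan.
plan = Filter("country = 'SE'", Filter("age >= 18", Scan("people")))
optimized = combine_filters(plan)
print(optimized)
```

A real optimizer applies many such rules repeatedly; this sketch shows just one, under the assumptions above.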

To quote the Databricks blog post linked below: "First, because DataFrame and Dataset APIs are built on top of the Spark SQL engine, it uses Catalyst to generate an optimized logical and physical query plan."

https://databricks.com/blog/2016/07/14/a-tale-of-three-apache-spark-apis-rdds-dataframes-and-datasets.html

The RDD API, on the other hand, is low-level and apparently does not go through Catalyst.

Upvotes: 1
