Reputation: 407
I am a Spark beginner, and I'm confused about the relationship between Spark RDDs and Spark SQL. Is a Spark SQL query converted to RDD operations in the background?
Upvotes: 1
Views: 708
Reputation: 21810
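As far as I know, they take different paths to execution. A Spark SQL query does end up running as RDD operations, but only after it has passed through an optimizer that the RDD API bypasses.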
Spark SQL relies on an internal optimizer called Catalyst, which generates the logical plans for a query and optimizes them, including code generation.
Because the DataFrame and Dataset APIs are built on top of the Spark SQL engine, they also use Catalyst to generate an optimized logical and physical query plan.
The RDD API, on the other hand, is low-level and apparently does not use Catalyst.
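To make the difference concrete, here is a toy sketch in plain Python of why a declarative query plan can be optimized while an RDD-style function cannot. It does not use Spark at all: `Scan`, `Project`, `Filter`, `execute` and `push_filter_below_project` are hypothetical names invented for this illustration, and the single rewrite rule only gives a rough idea of the kind of thing an optimizer like Catalyst does, not how Catalyst is actually implemented.

```python
from dataclasses import dataclass


# Toy plan nodes. These are illustrative stand-ins, not Spark classes.
@dataclass
class Scan:
    rows: tuple          # source rows, each a dict


@dataclass
class Project:
    child: object        # plan node to read from
    columns: tuple       # column names to keep


@dataclass
class Filter:
    child: object        # plan node to read from
    column: str          # column to test
    value: object        # keep rows where row[column] == value


def execute(plan):
    """Naively evaluate a plan tree and return a list of row dicts."""
    if isinstance(plan, Scan):
        return [dict(r) for r in plan.rows]
    if isinstance(plan, Project):
        return [{c: r[c] for c in plan.columns} for r in execute(plan.child)]
    if isinstance(plan, Filter):
        return [r for r in execute(plan.child) if r[plan.column] == plan.value]
    raise TypeError(f"unknown plan node: {plan!r}")


def push_filter_below_project(plan):
    """One rewrite rule: Filter(Project(x)) becomes Project(Filter(x)).

    This is possible only because the plan is plain data that can be inspected.
    """
    if isinstance(plan, Filter) and isinstance(plan.child, Project):
        project = plan.child
        return Project(Filter(project.child, plan.column, plan.value),
                       project.columns)
    return plan


rows = (
    {"name": "ann", "age": 30, "city": "Oslo"},
    {"name": "bob", "age": 25, "city": "Rome"},
)

# "SQL style": the query is a tree the engine can inspect and rewrite.
plan = Filter(Project(Scan(rows), ("name", "age")), "age", 30)
optimized = push_filter_below_project(plan)

# "RDD style": the predicate is an opaque function. The engine can call it,
# but it cannot look inside it to reorder or combine operations.
opaque_predicate = lambda row: row["age"] == 30
rdd_style = [r for r in rows if opaque_predicate(r)]
```

Both `plan` and `optimized` evaluate to the same rows, but the rewritten plan filters before projecting. In real Spark, calling `explain()` on a DataFrame prints the plans that were produced for it, which is a quick way to see this optimization step in action.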
Upvotes: 1