Reputation: 531
If my understanding is correct, Spark applications may contain one or more jobs. A job may be split into stages, and stages can be split into tasks. I can more or less follow this in the Spark user interface (or at least I think so). But I am confused about the meaning of the SQL tab.
In particular:
spark.sql()?
I have been running some examples in order to understand, but it is still not very clear. Could you please help me?
Upvotes: 2
Views: 1127
Reputation: 1322
The SQL tab shows what you'd consider the execution plan. It shows the stages, run times, memory usage, and operations (Exchanges, Projections, etc.). Catalyst builds the query plan from your query, regardless of whether the query was written with spark.sql() or with Dataset/DataFrame operations.
You can find more information here:
If the application executes Spark SQL queries, the SQL tab displays information, such as the duration, jobs, and physical and logical plans for the queries.
https://spark.apache.org/docs/latest/web-ui.html
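To make this concrete, here is a minimal sketch (the sample data, view name, and local master setting are my own assumptions, not from the question) showing that a spark.sql() query and the equivalent DataFrame operations both go through Catalyst. explain(true) prints the parsed, analyzed, optimized, and physical plans, which is the same information the SQL tab visualizes, and running an action makes each query appear as an entry in that tab.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object SqlTabExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("sql-tab-demo")
      .master("local[*]")   // assumption: running locally for illustration
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data registered as a temporary view
    val people = Seq(("Alice", 34), ("Bob", 45), ("Cara", 29)).toDF("name", "age")
    people.createOrReplaceTempView("people")

    // 1) Query written as a SQL string
    val viaSql = spark.sql("SELECT name FROM people WHERE age > 30")

    // 2) The same query written with DataFrame operations;
    //    Catalyst produces an equivalent plan for both
    val viaApi = people.filter(col("age") > 30).select("name")

    // Print the logical and physical plans for each query
    viaSql.explain(true)
    viaApi.explain(true)

    // Actions trigger execution, so both queries show up as
    // entries in the SQL tab of the web UI (default: http://localhost:4040)
    viaSql.collect()
    viaApi.collect()

    spark.stop()
  }
}
```

If you open the SQL tab while (or after) this runs, each collect() should appear as its own query entry, and clicking it shows the DAG of operations (Scan, Filter, Project, Exchange, etc.) along with durations and the associated jobs.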
Upvotes: 1