Reputation: 316
What is the difference between empDF["Last Name"].desc() and desc("Last Name")? Both give the same result, and both involve a shuffle operation.
>>> empDF.orderBy(empDF["Last Name"].desc()).show(4)
+------+----------+---------+------+------+
|Emp ID|First Name|Last Name|Gender|Salary|
+------+----------+---------+------+------+
|977421|   Zackary|  Zumwalt|     M|177521|
|741150|    Awilda|    Zuber|     F|144972|
|274620|  Eleanora|     Zook|     F|151026|
|242757|      Erin|     Zito|     F|127254|
+------+----------+---------+------+------+
only showing top 4 rows
>>> empDF.orderBy(desc("Last Name")).show(4)
+------+----------+---------+------+------+
|Emp ID|First Name|Last Name|Gender|Salary|
+------+----------+---------+------+------+
|977421|   Zackary|  Zumwalt|     M|177521|
|741150|    Awilda|    Zuber|     F|144972|
|274620|  Eleanora|     Zook|     F|151026|
|242757|      Erin|     Zito|     F|127254|
+------+----------+---------+------+------+
only showing top 4 rows
One thing I noticed: to use desc() before the column name, I had to import it with from pyspark.sql.functions import desc. Is the former a method of the Spark DataFrame Column class and the latter a Spark SQL function? Is there any documentation or explanation that clarifies this? I did not find any.
Thanks in advance.
Upvotes: 1
Views: 254
Reputation: 316
After going through the documentation multiple times, I understand now. There are two desc() implementations available in the pyspark.sql.* modules.
One is a function in the pyspark.sql.functions module. It takes a mandatory column argument.
The second is a method of the pyspark.sql.Column class. It does not take any argument.
Both return the same descending sort expression. They differ only in how they are called, so they can be used interchangeably given the proper import statement.
Upvotes: 0
Reputation: 6323
Both are the same thing.
As per the documentation and source code (functions.desc..):
/**
 * Returns a sort expression based on the descending order of the column.
 * {{{
 *   df.sort(asc("dept"), desc("age"))
 * }}}
 *
 * @group sort_funcs
 * @since 1.3.0
 */
def desc(columnName: String): Column = Column(columnName).desc
Internally, desc(columnName) calls Column(columnName).desc, so both are the same (take them as two alternatives that perform the same operation).
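To make the delegation concrete, here is a minimal pure-Python sketch of the same pattern. It is a toy model, not Spark's actual code: the `Column` and `SortOrder` classes below are hypothetical stand-ins written only for illustration, and the only thing borrowed from the source above is the idea that the function form builds a column from its name and then calls the method form.

```python
# Toy model of the delegation pattern; NOT Spark's actual implementation.
# `Column` and `SortOrder` are hypothetical stand-ins for illustration only.
from dataclasses import dataclass


@dataclass(frozen=True)
class SortOrder:
    """Hypothetical stand-in for a sort expression."""
    column_name: str
    direction: str


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a DataFrame column."""
    name: str

    def desc(self) -> SortOrder:
        # Method form: called on an existing column, takes no arguments.
        return SortOrder(self.name, "DESC")


def desc(column_name: str) -> SortOrder:
    # Function form: builds the column from its name, then delegates to
    # the method, mirroring `Column(columnName).desc` in the Scala source.
    return Column(column_name).desc()


via_method = Column("Last Name").desc()
via_function = desc("Last Name")
print(via_method == via_function)  # True: both build the same sort expression
```

Because the function form is only a thin wrapper, choosing between the two is a matter of style: the method form reads naturally when you already hold a column object, while the function form is shorter when you only have the column name.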
Upvotes: 1