a.moussa

Reputation: 3287

Sort Spark DataFrame's column by date

I have a DataFrame like this:

+---+-------------------+
|C1 |     C2            |
+---+-------------------+
| A |21/12/2015-17-14-12|
| A |21/12/2011-20-12-44|
| B |11/02/2015-15-31-11|
| B |09/04/2015-13-38-05|  
| C |11/06/2013-23-04-35|
+---+-------------------+

The second column is a timestamp in the format dd/mm/yyyy-hh-mm-ss. I would like to sort the rows by that timestamp, like this:

+---+-------------------+
|C1 |     C2            |
+---+-------------------+
| A |21/12/2011-20-12-44|
| C |11/06/2013-23-04-35|
| B |11/02/2015-15-31-11|
| B |09/04/2015-13-38-05|  
| A |21/12/2015-17-14-12|
+---+-------------------+

Perhaps I have to use a UDF? Do you have any ideas?

Upvotes: 0

Views: 3542

Answers (1)

zero323

Reputation: 330343

A simple one-liner is all you need. Required import:

import org.apache.spark.sql.functions.unix_timestamp

And the code:

input.sort(unix_timestamp($"C2", "dd/MM/yyyy-HH-mm-ss"))
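
For context, here is a minimal, self-contained sketch of the same idea. It assumes Spark 2.x or later with a local `SparkSession`; the object name `SortByDate` and the app name are hypothetical, and the sample rows are copied from the question. `unix_timestamp` parses each string into seconds since the epoch, so sorting on it orders the rows chronologically. Note that in the pattern `MM` means month and `mm` means minutes. The `$"C2"` syntax needs the session's implicits in scope.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.unix_timestamp

object SortByDate {
  def main(args: Array[String]): Unit = {
    // Assumption: Spark 2.x+ running in local mode.
    val spark = SparkSession.builder()
      .appName("sort-by-date")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // enables toDF and the $"col" syntax

    // Sample data taken from the question.
    val input = Seq(
      ("A", "21/12/2015-17-14-12"),
      ("A", "21/12/2011-20-12-44"),
      ("B", "11/02/2015-15-31-11"),
      ("B", "09/04/2015-13-38-05"),
      ("C", "11/06/2013-23-04-35")
    ).toDF("C1", "C2")

    // Parse C2 into epoch seconds and sort ascending on that value.
    // C2 itself is left unchanged as a string column.
    val sorted = input.sort(unix_timestamp($"C2", "dd/MM/yyyy-HH-mm-ss"))
    sorted.show(false)

    spark.stop()
  }
}
```

With this data the rows should come out in the order requested in the question. For newest-first ordering, append `.desc` to the sort expression.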

Upvotes: 3
