Reputation: 611
I'm interested in Apache Spark.
I'm trying to sort an RDD of arrays (RDD[Array[Int]]) in ascending order by an arbitrary column, in Scala.
(i.e. given RDD[Array[Int]] -> Array(Array(1,2,3), Array(2,3,4), Array(1,2,1)):
If I sort by the first column, the result should be Array(Array(1,2,3), Array(1,2,1), Array(2,3,4)).
If I sort by the third column, the result should be Array(Array(1,2,1), Array(1,2,3), Array(2,3,4)).)
The return value should again be an RDD[Array[Int]].
Is there a method to do this, for example using the map() or filter() function?
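For reference, the input RDD can be built like this (a minimal sketch; sc is assumed to be the SparkContext provided by spark-shell):
import org.apache.spark.rdd.RDD

// sc is the SparkContext from spark-shell
val rdd: RDD[Array[Int]] = sc.parallelize(Seq(Array(1, 2, 3), Array(2, 3, 4), Array(1, 2, 1)))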
Upvotes: 0
Views: 1836
Reputation: 5710
val baseRdd = sc.parallelize(Array(Array(1, 2, 3), Array(2, 3, 4), Array(1, 2, 1)))
// false as the second argument specifies descending order
val result = baseRdd.sortBy(x => x(1), false)
result.foreach { x => println(x(0) + "\t" + x(1) + "\t" + x(2)) }
Result
2 3 4
1 2 3
1 2 1
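The question asks for ascending order; the same call works with ascending = true (or with the flag dropped, since true is the default). A small variation of the snippet above, sorting by the first column:
// ascending sort by the first column (index 0)
val ascendingResult = baseRdd.sortBy(x => x(0), ascending = true)
ascendingResult.foreach { x => println(x(0) + "\t" + x(1) + "\t" + x(2)) }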
Upvotes: 0
Reputation: 37852
Use RDD.sortBy:
// sorting by second column (index = 1)
val result: RDD[Array[Int]] = rdd.sortBy(_(1), ascending = true)
The sorting function can also be written using Pattern Matching:
val result: RDD[Array[Int]] = rdd.sortBy({
  case Array(a, b, c) => b /* choose column(s) to sort by */
}, ascending = true)
Also note that the ascending argument's default value is true, so you can drop it and get the same result:
val result: RDD[Array[Int]] = rdd.sortBy(_(1))
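Putting it together, a minimal end-to-end sketch (assuming sc is an existing SparkContext, e.g. from spark-shell) that reproduces the results described in the question:
import org.apache.spark.rdd.RDD

val rdd: RDD[Array[Int]] = sc.parallelize(Seq(Array(1, 2, 3), Array(2, 3, 4), Array(1, 2, 1)))

// ascending sort by the first column (index 0); rows that tie on the key
// (the two rows starting with 1) have no guaranteed relative order
val byFirst: RDD[Array[Int]] = rdd.sortBy(_(0))
byFirst.collect().foreach(a => println(a.mkString(",")))

// ascending sort by the third column (index 2)
// prints 1,2,1 then 1,2,3 then 2,3,4
val byThird: RDD[Array[Int]] = rdd.sortBy(_(2))
byThird.collect().foreach(a => println(a.mkString(",")))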
Upvotes: 2