Luckylukee

Reputation: 595

Sorting a list as the output of the collect on JavaRDD

I am collecting an RDD and need to sort the resulting list using the Spark Java API. This is the code I have so far:

List<Long> alarmedTimeStamps = sensorDataDoubleDF
    .toJavaRDD()
    .filter(filterAlarmNode)
    .filter(filterAlarmInstances)
    .map(row -> row.getLong(2))
    .collect();
System.out.println("The type of collection is :" + alarmedTimeStamps.getClass().getTypeName());

// copy the collected values into a mutable Java list
ArrayList<Long> javaAlarmedTimeStamps = new ArrayList<>();
for (Long item : alarmedTimeStamps)
    javaAlarmedTimeStamps.add(item);

// sort the TTS points
Collections.sort(javaAlarmedTimeStamps);

The type of collection is :scala.collection.convert.Wrappers$SeqWrapper

I am converting the collected list manually here, but I wonder whether there is a better way.

Upvotes: 1

Views: 567

Answers (1)

vefthym

Reputation: 7462

I believe you are looking for the sortBy method.

Your code could be transformed to:

List<Long> javaAlarmedTimeStamps = sensorDataDoubleDF
    .toJavaRDD()
    .filter(filterAlarmNode)
    .filter(filterAlarmInstances)
    .map(row -> row.getLong(2))
    // arguments: key function, ascending = true, numPartitions = 1
    // (you could also sort the rows first with sortBy(row -> row.getLong(2), true, 1) and map afterwards)
    .sortBy(x -> x, true, 1)
    .collect();

By the way, the reason you cannot sort alarmedTimeStamps directly in your code is that it is immutable: as your output shows, collect() returns a wrapper around a Scala Seq. However, you can initialize your copy of alarmedTimeStamps, called javaAlarmedTimeStamps, more simply than you do now:

List<Long> javaAlarmedTimeStamps = new ArrayList<>(alarmedTimeStamps);
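
To see the copy-then-sort approach in isolation, here is a minimal sketch that runs without Spark. It assumes the collected list behaves like a read-only java.util.List, so Collections.unmodifiableList is used as a stand-in for the result of collect(). The class name SortedCopyExample and the helper sortedCopy are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortedCopyExample {

    // Copies a (possibly read-only) list into a fresh ArrayList and sorts the copy.
    // The source list is left untouched.
    public static List<Long> sortedCopy(List<Long> source) {
        List<Long> copy = new ArrayList<>(source);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Assumption: stand-in for the list returned by collect(), read-only and unsorted.
        List<Long> collected =
            Collections.unmodifiableList(Arrays.asList(30L, 10L, 20L));

        try {
            // Sorting in place needs to write into the list, which a read-only list rejects.
            Collections.sort(collected);
        } catch (UnsupportedOperationException e) {
            System.out.println("Cannot sort the read-only list in place");
        }

        // Sorting a mutable copy works.
        System.out.println(sortedCopy(collected)); // prints [10, 20, 30]
    }
}
```

If the data is small enough to collect anyway, sorting the copy on the driver like this is a reasonable alternative to sortBy; for larger data it is usually better to let Spark do the sorting before collecting.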

Upvotes: 1
