Reputation: 595
I am collecting an RDD and need to sort it using the Spark Java API. My current code is:
List<Long> alarmedTimeStamps = sensorDataDoubleDF
    .toJavaRDD()
    .filter(filterAlarmNode)
    .filter(filterAlarmInstances)
    .map(row -> row.getLong(2))
    .collect();
System.out.println("The type of collection is :" + alarmedTimeStamps.getClass().getTypeName());
// copy into a mutable list element by element
ArrayList<Long> javaAlarmedTimeStamps = new ArrayList<>();
for (Long item : alarmedTimeStamps) {
    javaAlarmedTimeStamps.add(item);
}
// sort the TTS points
Collections.sort(javaAlarmedTimeStamps);
This prints:
The type of collection is :scala.collection.convert.Wrappers$SeqWrapper
I am converting the collected list by hand here, but I wonder if there is a better way.
Upvotes: 1
Views: 567
Reputation: 7462
I believe you are looking for the sortBy method, which takes a key-extracting function, an ascending flag, and a number of partitions.
Your code could be transformed to:
List<Long> javaAlarmedTimeStamps = sensorDataDoubleDF
    .toJavaRDD()
    .filter(filterAlarmNode)
    .filter(filterAlarmInstances)
    .map(row -> row.getLong(2))
    .sortBy(x -> x, true, 1) // sort ascending, into a single partition; a variant that sorts before mapping is sketched below
    .collect();
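Alternatively, you could sort before mapping, keying sortBy on the timestamp column itself. A minimal sketch of that variant, assuming the same DataFrame and filter functions as in the question; note the map is kept after sortBy, because sortBy only reorders the rows and collect() would otherwise return a List<Row> instead of a List<Long>:

List<Long> javaAlarmedTimeStamps = sensorDataDoubleDF
    .toJavaRDD()
    .filter(filterAlarmNode)
    .filter(filterAlarmInstances)
    .sortBy(row -> row.getLong(2), true, 1) // order rows by the timestamp in column 2, ascending, one partition
    .map(row -> row.getLong(2))             // extract the timestamps after sorting
    .collect();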
By the way, the reason you cannot sort alarmedTimeStamps directly in your code is that collect() returned a wrapper around a Scala sequence (the SeqWrapper you printed), which does not support in-place modification. However, you can initialize your copy of alarmedTimeStamps, called javaAlarmedTimeStamps, more simply than you do:
List<Long> javaAlarmedTimeStamps = new ArrayList<>(alarmedTimeStamps);
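Since this copy is a plain mutable ArrayList, Collections.sort(javaAlarmedTimeStamps) then works exactly as before, without the element-by-element loop.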
Upvotes: 1