Knight71

Reputation: 2949

Custom ordering for sorting an RDD in Spark

I have an RDD of (String, Long, Int, String, String, List[Integer], String, String, String, Long, Long). I want to sort it by all the fields of the tuple: if `_1` of both elements is equal, move on to `_2`, otherwise return the result of the first comparison. I want to continue like this until the last element of the tuple.

The solution below looks too clumsy. Is there a better way to do this in Scala?

What I am trying looks like this:

val customOrdering = new Ordering[(String, Long, Int, String, String,
      List[Integer], String, String, String, Long, Long)] {
      override def compare(a: (String, Long, Int, String, String,
        List[Integer], String, String, String, Long, Long),
                           b: (String, Long, Int, String, String,
                             List[Integer], String, String, String, Long, Long)) = {

        if (a._1.compare(b._1) == 0) {
          if (a._2 == b._2) {
            ...

          }
          else if (a._2 < b._2) {
            -1 // a sorts before b; compare must return a negative value here, not 0
          }
          else {
            1
          }
        }
        else if (a._1.compare(b._1) < 0) {
          -1
        }
        else {
          1
        }
      }
    }
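As a point of comparison, Scala already ships implicit lexicographic orderings for tuples up to arity 9, so a smaller tuple can be sorted by every field with no custom code, and a larger one can be handled by nesting tuples in an `Ordering.by` key. A minimal sketch, assuming the `List[Integer]` field may be left out of the sort key (it has no implicit `Ordering`); the `Row` alias is mine:

```scala
// Tuples up to arity 9 have built-in lexicographic Orderings:
val data = Seq(("b", 2L, 1), ("a", 3L, 2), ("a", 1L, 5))
val sorted = data.sorted // compares _1, then _2, then _3

// For the 11-field tuple, group the fields into nested tuples of arity <= 9.
// The List[Integer] field (_6) is skipped here because it has no Ordering.
type Row = (String, Long, Int, String, String,
            List[Integer], String, String, String, Long, Long)
val ord: Ordering[Row] = Ordering.by { t =>
  ((t._1, t._2, t._3, t._4, t._5), (t._7, t._8, t._9, t._10, t._11))
}
```

With `ord` in implicit scope (or passed explicitly), `rdd.sortBy(identity)(ord, implicitly)` would sort by all the listed fields in order.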

Upvotes: 0

Views: 1039

Answers (1)

rhernando

Reputation: 1071

I'd try converting the tuples to sequences, zipping them, and then taking the first pair whose comparison is non-zero.

It would be something like:

first.productIterator.toSeq.zip(second.productIterator.toSeq).find {
  // assumes every field is Comparable at runtime;
  // the List[Integer] field would need its own handling
  case (x, y) => x.asInstanceOf[Comparable[Any]].compareTo(y) != 0
} match {
  case Some((x, y)) => x.asInstanceOf[Comparable[Any]].compareTo(y)
  case None => 0 // all fields equal
}
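This idea can be packaged as a small helper that works on any `Product` (tuples implement it). A runnable sketch; the `lexCompare` name is mine, and it assumes every field is `Comparable` at runtime, which holds for `String`, `Long`, `Int`, etc. but not for the `List[Integer]` field:

```scala
// Generic lexicographic compare over a tuple's fields: walk both tuples in
// parallel and return the first non-zero field comparison, or 0 if all match.
def lexCompare(a: Product, b: Product): Int =
  a.productIterator.zip(b.productIterator)
    .map { case (x, y) => x.asInstanceOf[Comparable[Any]].compareTo(y) }
    .find(_ != 0)
    .getOrElse(0)
```

For example, `lexCompare(("a", 1L), ("a", 2L))` is negative because the first fields tie and `1L < 2L`, so the helper plugs straight into the `compare` method of a custom `Ordering`.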

Upvotes: 1
