lightweight

Reputation: 3327

Removing the Some in a left join RDD in Spark

I'm running a left outer join on Spark RDDs, but sometimes the output looks like this:

(k, (v, Some(w)))

or

(k, (v, None))

How do I make it give me back just

(k, (v, (w)))

or

(k, (v, ()))

Here is how I'm combining the 2 files:

def formatMap3(left: String = "", right: String = "")(m: String = "") = {
  // maps each character of m to its own one-character String,
  // then sandwiches the result between the left and right delimiters
  val items = m.map(k => s"$k")
  s"$left$items$right"
}

val combPrdGrp = custPrdGrp3.leftOuterJoin(cmpgnPrdGrp3)

val combPrdGrp2 = combPrdGrp.groupByKey

val combPrdGrp3 = combPrdGrp2.map { case (n, list) => 
  val formattedPairs = list.map { case (a, b) => s"$a $b" }
  s"$n ${formattedPairs.mkString}"
}
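
For reference, the Option wrappers come from leftOuterJoin itself, so the intermediate types here would be (a sketch, assuming custPrdGrp3: RDD[(K, V)] and cmpgnPrdGrp3: RDD[(K, W)]):

val combPrdGrp  = custPrdGrp3.leftOuterJoin(cmpgnPrdGrp3) // RDD[(K, (V, Option[W]))]
val combPrdGrp2 = combPrdGrp.groupByKey                   // RDD[(K, Iterable[(V, Option[W])])]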

Upvotes: 2

Views: 3085

Answers (2)

Jason Scott Lenderman

Reputation: 1918

If you're just interested in getting formatted output without the Somes/Nones, then something like this should work:

val combPrdGrp3 = combPrdGrp2.map { case (n, list) =>
  val formattedPairs = list.map {
    case (a, Some(b)) => s"$a $b"
    case (a, None)    => s"$a ()" // render a missing right-side value as ()
  }
  s"$n ${formattedPairs.mkString}"
}

If you have other uses in mind then you probably need to provide more details.
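
For what it's worth, the same output can also be produced by folding over the Option instead of pattern matching; a minimal sketch, assuming combPrdGrp2 has the shape described in the question:

val combPrdGrp3 = combPrdGrp2.map { case (n, list) =>
  val formattedPairs = list.map { case (a, b) =>
    // fold takes the None result first, then the function applied in the Some case
    b.fold(s"$a ()")(w => s"$a $w")
  }
  s"$n ${formattedPairs.mkString}"
}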

Upvotes: 1

Rohan Aletty

Reputation: 2432

The leftOuterJoin() function in Spark returns tuples containing the join key, the left RDD's value, and an Option of the right RDD's value. To unwrap the Option, simply call getOrElse() on the right value in the resulting RDD. As an example:

scala> val rdd1 = sc.parallelize(Array(("k1", 4), ("k4", 7), ("k8", 10), ("k6", 1), ("k7", 4)))
rdd1: org.apache.spark.rdd.RDD[(String, Int)] = ParallelCollectionRDD[13] at parallelize at <console>:21

scala> val rdd2 = sc.parallelize(Array(("k5", 4), ("k4", 3), ("k0", 2), ("k6", 5), ("k1", 6)))
rdd2: org.apache.spark.rdd.RDD[(String, Int)] = ParallelCollectionRDD[14] at parallelize at <console>:21

scala> val rdd_join = rdd1.leftOuterJoin(rdd2).map { case (a, (b, c: Option[Int])) => (a, (b, (c.getOrElse()))) }
rdd_join: org.apache.spark.rdd.RDD[(String, (Int, AnyVal))] = MapPartitionsRDD[18] at map at <console>:25

scala> rdd_join.take(5).foreach(println)
...
(k4,(7,3))
(k6,(1,5))
(k7,(4,()))
(k8,(10,()))
(k1,(4,6))
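
Note that calling getOrElse() with no argument makes the compiler adapt the empty argument list to the Unit value (), which is why the inferred element type above widens to (Int, AnyVal). If you need a usable Int downstream, here is a sketch with a sentinel default instead (0 is an arbitrary assumption; pick a value that can't collide with real data):

val rdd_typed = rdd1.leftOuterJoin(rdd2).mapValues { case (b, c) =>
  (b, c.getOrElse(0)) // keeps the pair typed as (Int, Int) instead of (Int, AnyVal)
}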

Upvotes: 1
