Alok

Reputation: 542

Merging nested list of objects into a single list

I have a list of objects, one of whose fields is itself a list (actually a Seq[Row]; these come from RDDs), and I want to merge them together. Each row has the form <a, b, c, d>, where d is itself a list of rows <q, r, s, t>. One of those fields is yet another nested list, but let's ignore that for simplicity. I want to turn this into a list of <a, b, c, q1, r1, s1, t1>, <a, b, c, q2, r2, s2, t2>, ...

I can extract the information into case classes and then put them together, but I feel there should be a way to write this in a more functional style using zip, map, and so on. How should I do that?

Edit: Detailed description:

The lists come from a nested RDD table on HDFS.

parent: <Long, String, String, Long, String, Float, Seq[Row] foolist >
foolist: <String, String, Long, Int, Seq[Row] barlist >
barlist: <String, Boolean, Int, Long, Seq[Row] list1, Seq[Row] list2 >

They have more fields than stated. Apart from the parent object, I don't need to filter out any fields in the final result. A single row in the parent should become the following collection of rows:

{parent row}, {foo row 1}, {barlist row 1}
{parent row}, {foo row 1}, {barlist row 2}
{parent row}, {foo row 1}, {barlist row N}
{parent row}, {foo row 2}, {barlist row 1}
{parent row}, {foo row 2}, {barlist row N}
...
{parent row}, {foo row M}, {barlist row N}

These are not tuples, just plain lists of fields (Long, String, String, Long, String, Float, String, String, Long, Int, String, Boolean, Int, Long, ...).

Upvotes: 0

Views: 1174

Answers (2)

Alexandr Dorokhin

Reputation: 850

You can flatten arbitrarily nested sequences recursively:

def flat(seq: Seq[Any]):Seq[Any] = seq flatMap {
  case sq:Seq[_] => flat(sq)
  case x => Seq(x)
}
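To see what this does in practice, here is a small self-contained sketch on plain nested sequences. The object name and the sample row are made up for illustration and are not from the question. Note that it collapses everything into one flat list rather than producing one output row per nested row.

```scala
object FlattenDemo {
  // The recursive flatten from above, repeated so the snippet runs on its own.
  def flat(seq: Seq[Any]): Seq[Any] = seq flatMap {
    case sq: Seq[_] => flat(sq)   // nested sequence: recurse into it
    case x          => Seq(x)     // scalar field: keep as-is
  }

  // Hypothetical sample row: three scalar fields plus a nested list of rows.
  val row: Seq[Any] = Seq(1L, "a", "b", Seq(Seq("q1", "r1"), Seq("q2", "r2")))

  val flattened: Seq[Any] = flat(row)
  // flattened == Seq(1L, "a", "b", "q1", "r1", "q2", "r2")
}
```

Strings are left intact because a String does not match the `Seq[_]` type pattern.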

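Edit: if you want to flatten a Row as well, you can try:

import org.apache.spark.sql.Row

def flat(seq: Seq[Any]): Seq[Any] = seq flatMap {
  case row: Row   => flat(row.toSeq) // unpack the Row's fields and recurse
  case sq: Seq[_] => flat(sq)        // nested sequence (e.g. Seq[Row]): recurse
  case x          => Seq(x)          // scalar field: keep as-is
}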

Upvotes: 2

Nyavro

Reputation: 8866

You can use flatMap for that:

seq.flatMap {case (a,b,c,d) => d.map {case (q,r,s,t) => (a,b,c,q,r,s,t)}}

Or:

val res = for {
  (a,b,c,d) <- seq
  (q,r,s,t) <- d
} yield (a,b,c,q,r,s,t)
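The same idea might extend to the parent / foolist / barlist layout from the question's edit. The following is only a sketch under stated assumptions: each row is modelled as a plain `Seq[Any]` instead of a Spark `Row`, the nested list is assumed to be the last field of each row, and the names `R`, `split`, `expand` and the sample data are hypothetical.

```scala
object NestedFlattenDemo {
  // Stand-in for a Spark Row (assumption): a plain Seq[Any] whose last field
  // holds the nested rows.
  type R = Seq[Any]

  // Split a row into its scalar fields and its trailing nested list.
  def split(row: R): (Seq[Any], Seq[R]) =
    (row.init, row.last.asInstanceOf[Seq[R]])

  // One output row per (parent, foo, bar) combination, as a plain list of fields.
  def expand(parents: Seq[R]): Seq[Seq[Any]] =
    for {
      parent <- parents
      (pFields, foos) = split(parent)
      foo <- foos
      (fFields, bars) = split(foo)
      bar <- bars
    } yield pFields ++ fFields ++ bar

  // Hypothetical sample: one parent with two foo rows, holding two and one bar rows.
  val parents: Seq[R] = Seq(
    Seq(1L, "p", Seq(
      Seq("f1", 10, Seq(Seq("b1", true), Seq("b2", false))),
      Seq("f2", 20, Seq(Seq("b3", true)))
    ))
  )

  val rows: Seq[Seq[Any]] = expand(parents)
  // rows == Seq(
  //   Seq(1L, "p", "f1", 10, "b1", true),
  //   Seq(1L, "p", "f1", 10, "b2", false),
  //   Seq(1L, "p", "f2", 20, "b3", true))
}
```

With real Spark rows, `Row.toSeq` should give a comparable `Seq[Any]` view of each row, though that part is not exercised here. The nested lists inside barlist (list1, list2) are left out, as in the question.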

Upvotes: 2
