Reputation: 425
I have a series of lists (let's assume the following three), where the first element of each tuple represents a primary key.
var A= List((1,"A"), (2,"B"), (3,"C"))
var B= List((1,"AA"), (2,"BB"), (3,"CC"), (4,"DD"))
var C= List((1,"AAA"), (3,"CCC"))
I would like to full join them together to form a new List such as the one below. You may assume the number of items in the resulting tuples is predetermined to be 4.
(1, "A", "AA", "AAA")
(2, "B", "BB", "" )
(3, "C", "CC", "CCC")
(4, "" , "DD", "" )
How can I achieve this in a functional manner and using Scala?
Upvotes: 0
Views: 516
Reputation: 1099
Since the question says we may assume the number of items in the resulting tuples is predetermined to be 4, the following solution, which returns only tuples as requested, works:
The lists given are:
var A= List((1,"A"), (2,"B"), (3,"C"))
var B= List((1,"AA"), (2,"BB"), (3,"CC"), (4,"DD"))
var C= List((1,"AAA"), (3,"CCC"))
In the Scala REPL:
scala> val list1 = List(A,B,C).flatten
list1: List[(Int, String)] = List((1,A), (2,B), (3,C), (1,AA), (2,BB), (3,CC), (4,DD), (1,AAA), (3,CCC))
scala> val list2 = List(A,B,C).flatten.map(x=>x._2.toArray).flatten.distinct
list2: List[Char] = List(A, B, C, D)
Then, using the above two lists, the required resultList can be obtained as below:
scala> val resultList =
list2.map(x=>list1.filter(y=>y._2.contains(x))).map{
case List() =>
case List((a,b)) => (a,b,"","")
case List((a,b),(_,c))=>(a,b,c,"")
case List((a,b),(_,c),(_,d)) =>(a,b,c,d)
}
resultList: List[Any] = List((1,A,AA,AAA), (2,B,BB,""), (3,C,CC,CCC), (4,DD,"",""))
But if we do care about the position of the empty string "" in each tuple, the code becomes rather lengthy, because the pattern match needs a guarded case for every combination. Note that the column is inferred here from the length of each string, which only works for this sample data:
scala> val resultList =
list2.map(x=>list1.filter(y=>y._2.contains(x))).map{
case List() =>
case List((a,b)) if(b.size==1) => (a,b,"","")
case List((a,b)) if(b.size==2) => (a,"",b,"")
case List((a,b)) if(b.size==3) => (a,"","",b)
case List((a,b),(_,c)) if(b.size==1 && c.size==2)=>(a,b,c,"")
case List((a,b),(_,c)) if(b.size==2 && c.size==1)=>(a,c,b,"")
case List((a,b),(_,c)) if(b.size==1 && c.size==3)=>(a,b,"",c)
case List((a,b),(_,c)) if(b.size==3 && c.size==1)=>(a,c,"",b)
case List((a,b),(_,c)) if(b.size==2 && c.size==3)=>(a,"",b,c)
case List((a,b),(_,c)) if(b.size==3 && c.size==2)=>(a,"",c,b)
case List((a,b),(_,c),(_,d)) if(b.size==1&&c.size==2 && d.size==3)=>
(a,b,c,d)
case List((a,b),(_,c),(_,d)) if(b.size==1&&c.size==3 && d.size==2)=>
(a,b,d,c)
case List((a,b),(_,c),(_,d)) if(b.size==2&&c.size==1&& d.size==3)=>
(a,c,b,d)
case List((a,b),(_,c),(_,d)) if(b.size==2&&c.size==3&& d.size==1)=>
(a,d,b,c)
case List((a,b),(_,c),(_,d)) if(b.size==3&&c.size==1&& d.size==2)=>
(a,c,d,b)
case List((a,b),(_,c),(_,d)) if(b.size==3&&c.size==2&& d.size==1)=>
(a,d,c,b)
}
resultList: List[Any] = List((1,A,AA,AAA), (2,B,BB,""), (3,C,CC,CCC), (4,"",DD,""))
Note, however, that with this approach the static type information is lost (the result is a List[Any]), which makes the resulting list of tuples hard to work with. It may be better to use another data structure, such as a List, instead. This solution is given only to match the requirements stated in the question.
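As a rough illustration of the "other data structure" idea, here is a minimal sketch that keeps the key typed as Int and the values as a List[String], with one column per input list. The names FullJoinSketch and fullJoin are hypothetical, not part of the answer above, and the sketch assumes each key occurs at most once per list (if it does not, toMap keeps the last value).

```scala
object FullJoinSketch {
  // Hypothetical helper: joins any number of (key, value) lists by key.
  // Assumes keys are unique within each list (primary-key semantics).
  def fullJoin(lists: List[List[(Int, String)]]): List[(Int, List[String])] = {
    // One lookup table per input list, so every list keeps its own column.
    val tables: List[Map[Int, String]] = lists.map(_.toMap)
    // Every key that occurs in any list, in ascending order.
    val keys: List[Int] = lists.flatten.map(_._1).distinct.sorted
    // For each key, read the value from each table, or "" when it is missing.
    keys.map(k => (k, tables.map(_.getOrElse(k, ""))))
  }

  val A = List((1, "A"), (2, "B"), (3, "C"))
  val B = List((1, "AA"), (2, "BB"), (3, "CC"), (4, "DD"))
  val C = List((1, "AAA"), (3, "CCC"))

  val joined: List[(Int, List[String])] = fullJoin(List(A, B, C))
}
```

Because the column is determined by which list a value came from rather than by the string length, the empty strings end up in the positions the question asks for, and the result type stays List[(Int, List[String])] instead of List[Any].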
Upvotes: 0
Reputation: 641
You can solve this with tail recursion:
import scala.annotation.tailrec

var a= List((1,"A"), (2,"B"), (3,"C"))
var b= List((1,"AA"), (2,"BB"), (3,"CC"), (4,"DD"))
var c= List((1,"AAA"), (3,"CCC"))
val lst: List[List[(Int, String)]] = List(a, b, c)
def fun(input: List[List[(Int, String)]]): List[Any] = {
@tailrec
def itr(acc: List[Any], inp: List[List[(Int, String)]], key: Int, maxKey: Int): List[Any] = {
key match {
case x if x > maxKey => acc
case _ =>
itr(acc ::: List(key :: inp.map(itemLst => {
itemLst.find(_._1 == key).map(_._2).getOrElse("")
})), inp, key + 1, maxKey)
}
}
// Walk every integer key from the smallest to the largest one present in the input
itr(List(), input, input.flatten.map(_._1).min, input.flatten.map(_._1).max)
}
println(fun(lst))
Output is
List(List(1, A, AA, AAA), List(2, B, BB, ), List(3, C, CC, CCC), List(4, , DD, ))
Upvotes: 1
Reputation: 12804
As mentioned in a comment, tuples in Scala are subject to limitations and abstracting over their arity can be cumbersome. In case you wish to do so, you may want to have a look at Shapeless.
For a more straightforward (albeit not very clean) solution, the following will do (with implementations for two different target arities):
val a = List((1,"A"), (2,"B"), (3,"C"))
val b = List((1,"AA"), (2,"BB"), (3,"CC"), (4,"DD"))
val c = List((1,"AAA"), (3,"CCC"))
def join4[K, V](empty: V)(pss: List[(K, V)]*): List[(K, V, V, V)] =
pss.reduceOption(_ ++ _).fold(List.empty[(K, V, V, V)])(_.groupBy(_._1).mapValues(_.map(_._2)).collect {
case (key, Nil) => (key, empty, empty, empty)
case (key, List(a)) => (key, a, empty, empty)
case (key, List(a, b)) => (key, a, b, empty)
case (key, List(a, b, c)) => (key, a, b, c)
case (key, list) => throw new RuntimeException(s"Group for $key is too long (${list.size} > 3)")
}.toList)
def join5[K, V](empty: V)(pss: List[(K, V)]*): List[(K, V, V, V, V)] =
pss.reduceOption(_ ++ _).fold(List.empty[(K, V, V, V, V)])(_.groupBy(_._1).mapValues(_.map(_._2)).collect {
case (key, Nil) => (key, empty, empty, empty, empty)
case (key, List(a)) => (key, a, empty, empty, empty)
case (key, List(a, b)) => (key, a, b, empty, empty)
case (key, List(a, b, c)) => (key, a, b, c, empty)
case (key, List(a, b, c, d)) => (key, a, b, c, d)
case (key, list) => throw new RuntimeException(s"Group for $key is too long (${list.size} > 4)")
}.toList)
join4("")(a, b, c)
join5("")(a, b, c)
You can play with this code on Scastie.
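Note that join4 above fills the tuple from the left, so key 4 comes out as (4, "DD", "", "") rather than (4, "", "DD", ""). If the column position matters, one possible variant is sketched below. The names PositionalJoin and join4ByPosition are hypothetical, and the sketch assumes exactly three input lists with unique keys in each.

```scala
object PositionalJoin {
  // Hypothetical variant of join4: one parameter per input list, so each
  // value lands in the column of the list it came from.
  // Assumes keys are unique within each list; toMap keeps the last duplicate.
  def join4ByPosition[K: Ordering, V](empty: V)(
      first: List[(K, V)],
      second: List[(K, V)],
      third: List[(K, V)]
  ): List[(K, V, V, V)] = {
    val m1 = first.toMap
    val m2 = second.toMap
    val m3 = third.toMap
    // All keys seen in any of the three lists, sorted by the implicit Ordering[K].
    val keys = (first ++ second ++ third).map(_._1).distinct.sorted
    keys.map { k =>
      (k, m1.getOrElse(k, empty), m2.getOrElse(k, empty), m3.getOrElse(k, empty))
    }
  }

  val a = List((1, "A"), (2, "B"), (3, "C"))
  val b = List((1, "AA"), (2, "BB"), (3, "CC"), (4, "DD"))
  val c = List((1, "AAA"), (3, "CCC"))

  val result: List[(Int, String, String, String)] = join4ByPosition("")(a, b, c)
}
```

The trade-off is the same as with join4: the arity is fixed in the signature, so a different number of input lists needs a separate function.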
Upvotes: 0
Reputation: 3638
Let's say you are given input lists such as
var A= List((1,"A"), (2,"B"), (3,"C"))
var B= List((1,"AA"), (2,"BB"), (3,"CC"), (4,"DD"))
var C= List((1,"AAA"), (3,"CCC"))
Then by applying the following function,
List(A,B,C).flatten.groupBy(_._1).map{
case (k,v) => k :: v.map(_._2)
}
you will get the output
res0: scala.collection.immutable.Iterable[List[Any]] = List(List(2, B, BB), List(4, DD), List(1, A, AA, AAA), List(3, C, CC, CCC))
However, if you still want empty strings in your output, you can try the following:
var A= List((1,"A"), (2,"B"), (3,"C"))
var B= List((1,"AA"), (2,"BB"), (3,"CC"), (4,"DD"))
var C= List((1,"AAA"), (3,"CCC"))
val intermediate = List(A,B,C).flatten.groupBy(_._1).map{
case (k,v) => k :: v.map(_._2)
}
val maxSize = intermediate.map(_.size).max
intermediate.map { x =>
// Pad shorter rows with empty strings; List.fill(0)("") is empty, so full rows are unchanged
x ::: List.fill(maxSize - x.size)("")
}
This gives the output
res0: scala.collection.immutable.Iterable[List[Any]] = List(List(2, "B", "BB", ), List(4, "DD", , ), List(1, "A", "AA", "AAA"), List(3, "C", "CC", "CCC"))
Tuples have performance limitations, and in Scala 2 their size is limited to 22 elements, so it is advisable to use lists instead.
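Two things in the output above may be worth tightening: groupBy returns a Map, so the rows come back in no particular key order, and prepending the Int key to the String values widens the element type to Any. The following is a minimal sketch of one way around both, assuming it is acceptable to keep the key separate from the values. The object name PaddedGroups is hypothetical. Like the code above, it pads at the end of each row, so it does not preserve column positions.

```scala
object PaddedGroups {
  val A = List((1, "A"), (2, "B"), (3, "C"))
  val B = List((1, "AA"), (2, "BB"), (3, "CC"), (4, "DD"))
  val C = List((1, "AAA"), (3, "CCC"))

  // Group by key, restore ascending key order, and keep the key apart
  // from the values so the values stay typed as List[String].
  val grouped: List[(Int, List[String])] =
    List(A, B, C).flatten
      .groupBy(_._1)
      .toList
      .sortBy(_._1)
      .map { case (k, pairs) => (k, pairs.map(_._2)) }

  // Width of the longest row; every shorter row is padded up to it.
  val width: Int = grouped.map(_._2.size).max

  // padTo appends "" until the list reaches the requested length.
  val padded: List[(Int, List[String])] =
    grouped.map { case (k, values) => (k, values.padTo(width, "")) }
}
```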
Upvotes: 1