Michael Biniashvili

Reputation: 610

Merging list of tuples in scala based on key

I have a list of tuples that looks like this:

Seq("ptxt"->"how","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")

Based on the keys, I want to merge each ptxt value with all the list values that come after it (up to the next ptxt), i.e. create a new Seq that looks like this:

Seq("how you doing", "whats up", "this is cool")

Upvotes: 0

Views: 597

Answers (4)

Krzysztof Atłasik

Reputation: 22595

You could fold your Seq with foldLeft:

val s = Seq("ptxt"->"how ","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")

val r: Seq[String] = s.foldLeft(List[String]()) {
  case (xs, ("ptxt", s)) => s :: xs
  case (x :: xs, ("list", s)) => (x + s) :: xs
}.reverse

If you don't care about the order, you can omit reverse.


The foldLeft function takes two arguments: the first is the initial value, and the second is a function of two arguments, the previous result and the current element of the sequence. The result of each call is then fed to the next call as its first argument.

For example, for numbers, the foldLeft below just sums all the elements, starting from the left.

List(5, 4, 8, 6, 2).foldLeft(0) { (result, i) =>
  result + i
} // 25

In our case, we start with an empty list. Then we provide a function that handles two cases using pattern matching.

  • The case when the key is "ptxt". Here we just prepend the value to the list.

    case (xs, ("ptxt", s)) => s :: xs
    
  • The case when the key is "list". Here we take the first string from the list (using pattern matching), concatenate the value to it, and then put it back in front of the rest of the list.

    case (x :: xs, ("list", s)) => (x + s) :: xs
    

At the end, since we were prepending elements, we need to reverse the list. Why prepend rather than append? Because appending to an immutable List is O(n) while prepending is O(1), so prepending is more efficient.
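To make the intermediate steps visible, here is a minimal sketch that swaps foldLeft for scanLeft, which keeps every accumulator value instead of only the last one. The names pairs and steps and the shortened input are only for illustration, and the final fallback case is my own assumption so that the match is total.

```scala
// Shortened illustrative input (not the full Seq from the question)
val pairs = Seq("ptxt" -> "how ", "list" -> "you doing", "ptxt" -> "whats up")

// scanLeft applies the same function as foldLeft but returns every accumulator
val steps = pairs.scanLeft(List.empty[String]) {
  case (xs, ("ptxt", v))      => v :: xs        // prepend a new string
  case (x :: xs, ("list", v)) => (x + v) :: xs  // extend the head string
  case (xs, _)                => xs             // assumption: ignore anything else
}
// steps: List(), List("how "), List("how you doing"), List("whats up", "how you doing")
```

The last element of steps is what foldLeft would return before the reverse, which is why the newest string sits at the head.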

Upvotes: 8

pme

Reputation: 14803

Here is another solution:

val data = Seq("ptxt"->"how","list"->"you doing","ptxt"->"whats", "list" -> "up","ptxt"-> "this ", "list"->"is cool")

First group Keys and Values:

val grouped = data.groupBy(_._1)
               .map{case (k, l) => k -> l.map{case (_, v) => v.trim}}

// > Map(list -> List(you doing, up, is cool), ptxt -> List(how, whats, this))

Then zip and concatenate the two values:

grouped("ptxt").zip(grouped("list"))
    .map{case (a, b) => s"$a $b"}

// > List(how you doing, whats up, this is cool)

Disclaimer: This only works if the list always alternates key, value, key, value, and so on. I had to adjust the input data.
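To illustrate that limitation, here is a small sketch that runs the same group-and-zip steps on the unadjusted input from the question. The names original, byKey and zipped are just for this example.

```scala
// Input from the question: "ptxt" and "list" entries do not strictly alternate
val original = Seq("ptxt" -> "how", "list" -> "you doing", "ptxt" -> "whats up",
  "ptxt" -> "this ", "list" -> "is ", "list" -> "cool")

// Same grouping as above: key -> trimmed values in order of appearance
val byKey = original.groupBy(_._1).map { case (k, l) => k -> l.map(_._2.trim) }

// zip pairs values purely by position, regardless of where they sat in the input
val zipped = byKey("ptxt").zip(byKey("list")).map { case (a, b) => s"$a $b" }
// zipped: List("how you doing", "whats up is", "this cool")
```

The second and third strings are paired incorrectly, because zip has no knowledge of which "list" entries followed which "ptxt" entry.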

Upvotes: 3

Sachin

Reputation: 102

Adding another answer since I don't have enough reputation points to comment. This is just an improvement on Krzysztof Atłasik's answer: to handle the case where the Seq starts with a "list" entry, you might want to add another case:

  case (xs, ("list", s)) if xs.isEmpty => xs

So the final code could be something like:

val s = Seq("list"->"how ","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")

val r: Seq[String] = s.foldLeft(List[String]()) {
  case (xs, ("list", s)) if xs.isEmpty => xs
  case (xs, ("ptxt", s)) => s :: xs
  case (x :: xs, ("list", s)) => (x + s) :: xs
}.reverse
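As a quick check of what the extra case does, here is a sketch that wraps the fold in a helper. The name mergeAfterPtxt is hypothetical, and the last fallback case is my own assumption so that unknown keys do not throw a MatchError.

```scala
// Hypothetical helper wrapping the fold; the name is not from the answers above
def mergeAfterPtxt(pairs: Seq[(String, String)]): Seq[String] =
  pairs.foldLeft(List.empty[String]) {
    case (Nil, ("list", _))     => Nil            // no "ptxt" seen yet: drop the value
    case (xs, ("ptxt", v))      => v :: xs        // start a new merged string
    case (x :: xs, ("list", v)) => (x + v) :: xs  // extend the most recent string
    case (xs, _)                => xs             // assumption: ignore any other key
  }.reverse

val leadingList = Seq("list" -> "how ", "list" -> "you doing", "ptxt" -> "whats up",
  "ptxt" -> "this ", "list" -> "is ", "list" -> "cool")
// mergeAfterPtxt(leadingList) gives List("whats up", "this is cool")
```

Note that both leading "list" values are silently dropped, which may or may not be what you want.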

Upvotes: 1

If you use a List instead of a Seq, you can solve this with a simple tail-recursive function.
(The code uses Scala 2.13, but it can be rewritten for older Scala versions if needed.)

def mergeByKey[K](list: List[(K, String)]): List[String] = {
  @annotation.tailrec
  def loop(remaining: List[(K, String)], acc: Map[K, StringBuilder]): List[String] =
    remaining match {
      case Nil =>
        acc.valuesIterator.map(_.result()).toList

      case (key, value) :: tail =>
        loop(
          remaining = tail,
          acc.updatedWith(key) {
            case None           => Some(new StringBuilder(value))
            case Some(oldValue) => Some(oldValue.append(value))
          }
        )
    }
  loop(remaining = list, acc = Map.empty)
}

val data = List("ptxt"->"how","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
mergeByKey(data)
// res: List[String] = List("howwhats upthis ", "you doingis cool")

Or a one-liner using groupMap
(inspired by pme's answer):

data.groupMap(_._1)(_._2).view.mapValues(_.mkString).valuesIterator.toList
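For readers who have not used groupMap (added in Scala 2.13), here is a minimal sketch comparing it with the groupBy-then-map form used in pme's answer. The input and the names sample, viaGroupMap and viaGroupBy are only for illustration.

```scala
// Shortened illustrative input
val sample = List("ptxt" -> "how", "list" -> "you doing", "ptxt" -> "whats up")

// groupMap(key)(value) groups by the key and maps each element in one step
val viaGroupMap: Map[String, List[String]] = sample.groupMap(_._1)(_._2)

// The same result with groupBy followed by a map over each group
val viaGroupBy: Map[String, List[String]] =
  sample.groupBy(_._1).map { case (k, group) => k -> group.map(_._2) }

// Both contain "ptxt" -> List("how", "whats up") and "list" -> List("you doing")
```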

Upvotes: 2
