Tomer Gabel
Tomer Gabel

Reputation: 4112

Cleaner tuple groupBy

I have a sequence of key-value pairs (String, Int), and I want to group them by key into a sequence of values (i.e. Seq[(String, Int)]) => Map[String, Iterable[Int]])).

Obviously, toMap isn't useful here, and groupBy maintains the values as tuples. The best I managed to come up with is:

val seq: Seq[( String, Int )]
// ...
seq.groupBy( _._1 ).mapValues( _.map( _._2 ) )

Is there a cleaner way of doing this?

Upvotes: 33

Views: 11644

Answers (4)

Luigi Plinge
Luigi Plinge

Reputation: 51109

There's no method or data structure in the standard library to do this, and your solution looks about as concise as you'll get. If you use this in more than one place, you might like to factor it out into a utility method

def groupTuples[A, B](seq: Seq[(A, B)]) = 
  seq groupBy (_._1) mapValues (_ map (_._2))

which you then obviously just call with groupTuples(seq). This might not be the most efficient possible in terms of CPU clock cycles, but I don't think it's particularly inefficient either.

I did a rough benchmark against Jean-Philippe's solution on a list of 9 tuples and this is marginally faster. Both were about twice as fast as folding the sequence into a map (effectively re-implementing groupBy to give the output you want).

Upvotes: 12

Xavier Guihot
Xavier Guihot

Reputation: 61666

Starting Scala 2.13, most collections are provided with the groupMap method which is (as its name suggests) an equivalent (more efficient) of a groupBy followed by mapValues:

List(1 -> 'a', 1 -> 'b', 2 -> 'c').groupMap(_._1)(_._2)
// Map[Int,List[Char]] = Map(2 -> List(c), 1 -> List(a, b))

This:

  • groups elements based on the first part of tuples (Map(2 -> List((2,c)), 1 -> List((1,a), (1,b))))

  • maps grouped values (List((1,a), (1,b))) by taking their second tuple part (List(a, b)).

Upvotes: 3

Jean-Philippe Pellet
Jean-Philippe Pellet

Reputation: 59994

Here's a pimp that adds a toMultiMap method to traversables. Would it solve your problem?

import collection._
import mutable.Builder
import generic.CanBuildFrom

class TraversableOnceExt[CC, A](coll: CC, asTraversable: CC => TraversableOnce[A]) {

  def toMultiMap[T, U, That](implicit ev: A <:< (T, U), cbf: CanBuildFrom[CC, U, That]): immutable.Map[T, That] =
    toMultiMapBy(ev)

  def toMultiMapBy[T, U, That](f: A => (T, U))(implicit cbf: CanBuildFrom[CC, U, That]): immutable.Map[T, That] = {
    val mutMap = mutable.Map.empty[T, mutable.Builder[U, That]]
    for (x <- asTraversable(coll)) {
      val (key, value) = f(x)
      val builder = mutMap.getOrElseUpdate(key, cbf(coll))
      builder += value
    }
    val mapBuilder = immutable.Map.newBuilder[T, That]
    for ((k, v) <- mutMap)
      mapBuilder += ((k, v.result))
    mapBuilder.result
  }
}

implicit def commomExtendTraversable[A, C[A] <: TraversableOnce[A]](coll: C[A]): TraversableOnceExt[C[A], A] =
  new TraversableOnceExt[C[A], A](coll, identity)

Which can be used like this:

val map = List(1 -> 'a', 1 -> 'à', 2 -> 'b').toMultiMap
println(map)  // Map(1 -> List(a, à), 2 -> List(b))

val byFirstLetter = Set("abc", "aeiou", "cdef").toMultiMapBy(elem => (elem.head, elem))
println(byFirstLetter) // Map(c -> Set(cdef), a -> Set(abc, aeiou))

If you add the following implicit defs, it will also work with collection-like objects such as Strings and Arrays:

implicit def commomExtendStringTraversable(string: String): TraversableOnceExt[String, Char] =
  new TraversableOnceExt[String, Char](string, implicitly)

implicit def commomExtendArrayTraversable[A](array: Array[A]): TraversableOnceExt[Array[A], A] =
  new TraversableOnceExt[Array[A], A](array, implicitly)

Then:

val withArrays = Array(1 -> 'a', 1 -> 'à', 2 -> 'b').toMultiMap
println(withArrays) // Map(1 -> [C@377653ae, 2 -> [C@396fe0f4)

val byLowercaseCode = "Mama".toMultiMapBy(c => (c.toLower.toInt, c))
println(byLowercaseCode) // Map(97 -> aa, 109 -> Mm)

Upvotes: 19

Johnny Everson
Johnny Everson

Reputation: 8601

I don't know if you consider it cleaner:

seq.groupBy(_._1).map { case (k,v) => (k,v.map(_._2))}

Upvotes: 8

Related Questions