Reputation: 41646
This inspiration for this question came when I tried to answer this one.
Say you have a sequence of data (may be from a CSV file for instance). groupBy
can be used to analyze certain aspect of the data, grouping by column or a combination of columns. For instance:
val groups0: Map[String, Array[String]] =
seq.groupBy(row => row(0) + "-" + row(4))
If I then want to create sub-groups within the groups I can do
val groups1: Map[String, Map[String, Array[String]]] =
groups0.mapValues(row => row.groupBy(_(1))
If I want to do this more one time it gets really cumbersome:
val groups2 =
groups1.mapValues(groups => groups.mapValues(row => row.groupBy(_(2)))
So here is my question given an arbitrary nesting of Map[K0, Map[K1, ..., Map[Kn, V]]]
, how do you write a mapValues
function that takes a f: (V) => B
and applies to the innermost V
to return a Map[K0, Map[K1, ..., Map[Kn, B]]]
?
Upvotes: 2
Views: 1269
Reputation: 95604
I'm sure there is a better way using Manifest
, but pattern matching seems to distinguish Seq
and Map
, so here it is:
object Foo {
def mapValues[A <: Map[_, _], C, D](map: A)(f: C => D): Map[_, _] = map.mapValues {
case seq: Seq[C] => seq.groupBy(f)
case innerMap: Map[_, _] => mapValues(innerMap)(f)
}
}
scala> val group0 = List("fooo", "bar", "foo") groupBy (_(0))
group0: scala.collection.immutable.Map[Char,List[java.lang.String]] = Map((f,List(fooo, foo)), (b,List(bar)))
scala> val group1 = Foo.mapValues(group0)((x: String) => x(1))
group1: scala.collection.immutable.Map[_, Any] = Map((f,Map(o -> List(fooo, foo))), (b,Map(a -> List(bar))))
scala> val group2 = Foo.mapValues(group1)((x: String) => x(2))
group2: scala.collection.immutable.Map[_, Any] = Map((f,Map(o -> Map(o -> List(fooo, foo)))), (b,Map(a -> Map(r -> List(bar)))))
Edit: Here's a typed version using higher-kinded type.
trait NestedMapValue[Z] {
type Next[X] <: NestedMapValue[Z]
def nextValues[D](f: Z => D): Next[D]
}
trait NestedMap[Z, A, B <: NestedMapValue[Z]] extends NestedMapValue[Z] { self =>
type Next[D] = NestedMap[Z, A, B#Next[D]]
val map: Map[A, B]
def nextValues[D](f: Z => D): Next[D] = self.mapValues(f)
def mapValues[D](f: Z => D): NestedMap[Z, A, B#Next[D]] = new NestedMap[Z, A, B#Next[D]] { val map = self.map.mapValues {
case x: B => x.nextValues[D](f)
}}
override def toString = "NestedMap(%s)" format (map.toString)
}
trait Bottom[A] extends NestedMapValue[A] {
type Next[D] = NestedMap[A, D, Bottom[A]]
val seq: Seq[A]
def nextValues[D](f: A => D): Next[D] = seq match {
case seq: Seq[A] => groupBy[D](f)
}
def groupBy[D](f: A => D): Next[D] = seq match {
case seq: Seq[A] =>
new NestedMap[A, D, Bottom[A]] { val map = seq.groupBy(f).map { case (key, value) => (key, new Bottom[A] { val seq = value })} }
}
override def toString = "Bottom(%s)" format (seq.toString)
}
object Bottom {
def apply[A](aSeq: Seq[A]) = new Bottom[A] { val seq = aSeq }
}
scala> val group0 = Bottom(List("fooo", "bar", "foo")).groupBy(x => x(0))
group0: NestedMap[java.lang.String,Char,Bottom[java.lang.String]] = NestedMap(Map(f -> Bottom(List(fooo, foo)), b -> Bottom(List(bar))))
scala> val group1 = group0.mapValues(x => x(1))
group1: NestedMap[java.lang.String,Char,Bottom[java.lang.String]#Next[Char]] = NestedMap(Map(f -> NestedMap(Map(o -> Bottom(List(fooo, foo)))), b -> NestedMap(Map(a -> Bottom(List(bar))))))
scala> val group2 = group1.mapValues(x => x.size)
group2: NestedMap[java.lang.String,Char,Bottom[java.lang.String]#Next[Char]#Next[Int]] = NestedMap(Map(f -> NestedMap(Map(o -> NestedMap(Map(4 -> Bottom(List(fooo)), 3 -> Bottom(List(foo)))))), b -> NestedMap(Map(a -> NestedMap(Map(3 -> Bottom(List(bar))))))))
Upvotes: 3
Reputation: 216
My first instinct said that handling arbitrary nesting in a type-safe way would be impossible, but it seems that it IS possible if you define a few implicits that tell the compiler how to do it.
Essentially, the "simple" mapper tells it how to handle the plain non-nested case, while "wrappedMapper" tells it how to drill down through one Map layer:
// trait to tell us how to map inside of a container.
trait CanMapInner[WrappedV, WrappedB,V,B] {
def mapInner(in: WrappedV, f: V => B): WrappedB
}
// simple base case (no nesting involved).
implicit def getSimpleMapper[V,B] = new CanMapInner[V,B,V,B] {
def mapInner(in: V, f: (V) => B): B = f(in)
}
// drill down one level of "Map".
implicit def wrappedMapper[K,V,B,InnerV,InnerB]
(implicit innerMapper: CanMapInner[InnerV,InnerB,V,B]) =
new CanMapInner[Map[K,InnerV], Map[K,InnerB],V,B] {
def mapInner(in: Map[K, InnerV], f: (V) => B): Map[K, InnerB] =
in.mapValues(innerMapper.mapInner(_, f))
}
// the actual implementation.
def deepMapValues[K,V,B,WrappedV,WrappedB](map: Map[K,WrappedV], f: V => B)
(implicit mapper: CanMapInner[WrappedV,WrappedB,V,B]) = {
map.mapValues(inner => mapper.mapInner(inner, f))
}
// testing with a simple map
{
val initMap = Map(1 -> "Hello", 2 -> "Goodbye")
val newMap = deepMapValues(initMap, (s: String) => s.length)
println(newMap) // Map(1 -> 5, 2 -> 7)
}
// testing with a nested map
{
val initMap = Map(1 -> Map("Hi" -> "Hello"), 2 -> Map("Bye" -> "Goodbye"))
val newMap = deepMapValues(initMap, (s: String) => s.length)
println(newMap) // Map(1 -> Map(Hi -> 5), 2 -> Map(Bye -> 7))
}
Of course, in real code the pattern-matching dynamic solution is awfully tempting thanks to its simplicity. Type-safety isn't everything :)
Upvotes: 5