Chris
Chris

Reputation: 2107

Why does this Iterable produce a Set after mapping?

In the sample code below, why does the Iterable[String] test1 produce a Set after mapping?

val foo = Map("a" -> 1, "b" -> 1)
val test1: Iterable[String] = foo.keys
val test2: Iterator[String] = foo.keys.toIterator

println(test1.map(foo).size) // 1
println(test2.map(foo).size) // 2

I was puzzled by this because its entirely counter-intuitive when reading the code. Even though foo.keys just returns an Iterable, it creates a Set when calling map, as the reflection code shows:

println(test1.map(foo).getClass.getName) // immutable.Set.Set1
println(test2.map(foo).getClass.getName) // Iterator$$anon$11

How does the standard library determine that it should create an immutable.Set here, even though the inferred type of the collection is just Iterable[String]?

Upvotes: 4

Views: 621

Answers (3)

som-snytt
som-snytt

Reputation: 39577

Excavating Kolmar's comment that, although an implicit argument is in play that determines how the result collection is built, in this case the source collection is simply queried for the builder to use.

Iterable.map:

def map[B, That](f: (A) ⇒ B)(implicit bf: CanBuildFrom[Iterable[A], B, That]): That

Implicit scope includes types related to the type args, including Iterable and Int.

Iterable defines a "generic" CanBuildFrom that invokes genericBuilder on the source collection. That's how the result type is tied to the source.

Conversely, the result collection is divorced from the source by taking a CanBuildFrom[From = Nothing, _, _]. This is how cc.to[Set] is expressed, where a Set is built without regard for the source collection cc. For operations such as map, the method collection.breakOut provides such a CanBuildFrom, where the result type can be usefully inferred.

You can inject an arbitrary CanBuildFrom for the desired behavior:

$ scala
Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_92).
Type in expressions for evaluation. Or try :help.

scala> val m = Map("a" -> 1, "b" -> 1)
m: scala.collection.immutable.Map[String,Int] = Map(a -> 1, b -> 1)

scala> val k = m.keys
k: Iterable[String] = Set(a, b)

scala> import collection.{generic, mutable}, generic.{CanBuildFrom => CBF}, mutable.ListBuffer
import collection.{generic, mutable}
import generic.{CanBuildFrom=>CBF}
import mutable.ListBuffer

scala>   implicit def `as list`: CBF[Iterable[_], Int, List[Int]] =
     |     new CBF[Iterable[_], Int, List[Int]] {
     |       def apply() = new ListBuffer[Int]
     |       def apply(from: Iterable[_]) = apply()
     |     }
as$u0020list: scala.collection.generic.CanBuildFrom[Iterable[_],Int,List[Int]]

scala> k.map(m)
res0: List[Int] = List(1, 1)

Worth adding that completion can show types as of 2.11.8:

scala> k.map(m) //print<tab>

$line4.$read.$iw.$iw.k.map[Int, Iterable[Int]]($line3.$read.$iw.$iw.m)(scala.collection.Iterable.canBuildFrom[Int]) // : Iterable[Int]

Using breakOut:

scala> k.map(m)(collection.breakOut)
res1: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 1)

scala> k.map(m)(collection.breakOut) //print

$line4.$read.$iw.$iw.k.map[Int, scala.collection.immutable.IndexedSeq[Int]]($line3.$read.$iw.$iw.m)(scala.collection.`package`.breakOut[Any, Int, scala.collection.immutable.IndexedSeq[Int]](scala.Predef.fallbackStringCanBuildFrom[Int])) // : scala.collection.immutable.IndexedSeq[Int]

As shown, it actually picks up the CanBuildFrom intended for operations such as:

scala> "abc".map(_ + 1)
res2: scala.collection.immutable.IndexedSeq[Int] = Vector(98, 99, 100)

scala> "abc".map(_ + 1) //print

scala.Predef.augmentString("abc").map[Int, scala.collection.immutable.IndexedSeq[Int]](((x$1: Char) => x$1.+(1)))(scala.Predef.fallbackStringCanBuildFrom[Int]) // : scala.collection.immutable.IndexedSeq[Int]

Compare:

scala> k.map(m)(collection.breakOut) : List[Int] //print

(($line6.$read.$iw.$iw.k.map[Int, List[Int]]($line5.$read.$iw.$iw.m)(scala.collection.`package`.breakOut[Iterable[String], Int, List[Int]](scala.collection.immutable.List.canBuildFrom[Int]))): scala.`package`.List[scala.Int]) // : List[Int]

The canonical Q&A on breakOut.

Upvotes: 2

dveim
dveim

Reputation: 3482

This is tricky implicit magic. Simplified answer: there exists CanBuildFrom value, that is passed in implicit scope. When compiler is searching for most common type, it looks for implicits in arguments' scope.

In your example, compiler is able to figure out that most common type for foo.keys is Set. This sounds reasonable: Set can be viewed as Map with absent values (also java's HashMap/HashSet do that). When you convert to iterable, implicits are lost, and Set is gone (as side note, those CanBuildFrom hacks are not robust, and might get gone in future, because they really complicate extending of existing collections, you might also want to read this answer and comments).

Scala shares Java's concept of "de-jure and de-facto types". "de-jure" is one declared in method definition, but "de-facto" might be one of inheritors. That's why, for example, you see Map.keys type as Iterable, when de-facto it is Set (produced from Map.keySet, which has Set de-jure type).

Last, 1 in first println is because in underlying map foo all values are the same, and Set(1,1) becomes Set(1).

Upvotes: 0

Kim Stebel
Kim Stebel

Reputation: 42047

foo.keys returns a Set (despite its return type being more general) and calling map on a Set produces another Set. The inferred or compile time type is not always the most precise.

You can see that the keys method on a Set returns a Set even though the return type is Iterable[A]:

scala> Map(1 -> 2).keys
res0: Iterable[Int] = Set(1)

Upvotes: 5

Related Questions