Grouping keys based on values in scala map

Question

Suppose I've a scala map like this -

Map[String, List[String]] = Map(
"A"->List(1,2,3),
"B"->List(2,4),
"C"->List(4,5),
"D"->List(7,8,9),
"E"->List(8,9),
"F"->List(4,5,7),
"G"->List(1,3,2)
)

I want to group the keys such that their intersection is not empty. For example the result should be like -

Map(
List("A","B","G")->List(2),
List("C","F")->List(4,5),
List("D","E")->List(8,9)
)

Note that there can be multiple outputs for eg there can be List("B","C")->List(4). I just need to select anyone as long as all the keys are present AND the keys should not be repeated, for eg - 'B' and 'C' are present in the map already so later even if we still found any pair for intersection is not empty, for instance we found 'B' and 'C', we should NOT add it. My way to approach this problem includes converting list of values to a set and then iterate through all entries of map such that it keeps on intersecting its value with the next value and if intersection is not empty, we will group the keys.

But I want to know if there is any other way to do this.

EDIT -

The point is the key list should be the large and its value should have at least one element.

jwvh · Accepted Answer

I turned the Map's values into List[Int] just to make it easier to load into the IDE. That can be easily undone.

val m :Map[String, List[Int]] = Map(
  "A"->List(1,2,3),
  "B"->List(2,4),
  "C"->List(4,5),
  "D"->List(7,8,9),
  "E"->List(8,9),
  "F"->List(4,5,7),
  "G"->List(1,3,2))

Then I fold over the keys to see which ones intersect.

m.keys.foldLeft(List.empty[(List[String],List[Int])]){case (acc,k) =>
  val (not, has) = acc.partition(_._2.intersect(m(k)).isEmpty)
  has.headOption.fold((List(k),m(k)) :: acc){case (s,v) =>
    (k::s, v.intersect(m(k))) :: has.tail ::: not
  }
}.toMap
//res0: Map[List[String],List[Int]] = Map(List(D, E) -> List(8, 9)
//                                       ,List(C, F) -> List(4, 5)
//                                       ,List(B, G, A) -> List(2))

It's a bit of a "brute force" approach, getting the intersection once for the partition() and again when building the new entry to the accumulator, and the original Map m is dereferenced multiple times for each k value.

explanation

From the Map's keys I'm building a List of (List[String],List[Int]) tuples. For each Map key, the accumulated list is partitioned between those elements where there is an intersection with the new key's value (has), and those where there is no intersection (not).

I grab only the 1st element from the has group. If there is none then this key, and its value, are prepended to the result, else the key is prepended to the has.head key list and the value list is adjusted to reflect the current intersection. That new element is prepended to the has.tail (the intersections being ignored) and combined with everything in the not list.

Grouping keys based on values in scala map

Answers (2)

Related Questions