user826955
user826955

Reputation: 3206

groupBy on List as LinkedHashMap instead of Map

I am processing XML using scala, and I am converting the XML into my own data structures. Currently, I am using plain Map instances to hold (sub-)elements, however, the order of elements from the XML gets lost this way, and I cannot reproduce the original XML.

Therefore, I want to use LinkedHashMap instances instead of Map, however I am using groupBy on the list of nodes, which creates a Map:

For example:

  def parse(n:Node): Unit = 
  {
    val leaves:Map[String, Seq[XmlItem]] =
      n.child
        .filter(node => { ... })
        .groupBy(_.label)
        .map((tuple:Tuple2[String, Seq[Node]]) =>
        {
          val items = tuple._2.map(node =>
          {
            val attributes = ...

            if (node.text.nonEmpty)
              XmlItem(Some(node.text), attributes)
            else
              XmlItem(None, attributes)
          })

          (tuple._1, items)
        })

      ...
   }

In this example, I want leaves to be of type LinkedHashMap to retain the order of n.child. How can I achieve this?

Note: I am grouping by label/tagname because elements can occur multiple times, and for each label/tagname, I keep a list of elements in my data structures.


Solution
As answered by @jwvh I am using foldLeft as a substitution for groupBy. Also, I decided to go with LinkedHashMap instead of ListMap.

  def parse(n:Node): Unit = 
  {
    val leaves:mutable.LinkedHashMap[String, Seq[XmlItem]] =
      n.child
        .filter(node => { ... })
        .foldLeft(mutable.LinkedHashMap.empty[String, Seq[Node]])((m, sn) =>
        {
          m.update(sn.label, m.getOrElse(sn.label, Seq.empty[Node]) ++ Seq(sn))
          m
        })
        .map((tuple:Tuple2[String, Seq[Node]]) =>
        {
          val items = tuple._2.map(node =>
          {
            val attributes = ...

            if (node.text.nonEmpty)
              XmlItem(Some(node.text), attributes)
            else
              XmlItem(None, attributes)
          })

          (tuple._1, items)
        })

Upvotes: 4

Views: 529

Answers (2)

jwvh
jwvh

Reputation: 51271

To get the rough equivalent to .groupBy() in a ListMap you could fold over your collection. The problem is that ListMap preserves the order of elements as they were appended, not as they were encountered.

import collection.immutable.ListMap

List('a','b','a','c').foldLeft(ListMap.empty[Char,Seq[Char]]){
  case (lm,c) => lm.updated(c, c +: lm.getOrElse(c, Seq()))
}
//res0: ListMap[Char,Seq[Char]] = ListMap(b -> Seq(b), a -> Seq(a, a), c -> Seq(c))

To fix this you can foldRight instead of foldLeft. The result is the original order of elements as encountered (scanning left to right) but in reverse.

List('a','b','a','c').foldRight(ListMap.empty[Char,Seq[Char]]){
  case (c,lm) => lm.updated(c, c +: lm.getOrElse(c, Seq()))
}
//res1: ListMap[Char,Seq[Char]] = ListMap(c -> Seq(c), b -> Seq(b), a -> Seq(a, a))

This isn't necessarily a bad thing since a ListMap is more efficient with last and init ops, O(1), than it is with head and tail ops, O(n).

To process the ListMap in the original left-to-right order you could .toList and .reverse it.

List('a','b','a','c').foldRight(ListMap.empty[Char,Seq[Char]]){
  case (c,lm) => lm.updated(c, c +: lm.getOrElse(c, Seq()))
}.toList.reverse
//res2: List[(Char, Seq[Char])] = List((a,Seq(a, a)), (b,Seq(b)), (c,Seq(c)))

Upvotes: 1

Alexey Romanov
Alexey Romanov

Reputation: 170815

Purely immutable solution would be quite slow. So I'd go with

import collection.mutable.{ArrayBuffer, LinkedHashMap}

implicit class ExtraTraversableOps[A](seq: collection.TraversableOnce[A]) {
  def orderedGroupBy[B](f: A => B): collection.Map[B, collection.Seq[A]] = {
    val map = LinkedHashMap.empty[B, ArrayBuffer[A]]

    for (x <- seq) {
      val key = f(x)
      map.getOrElseUpdate(key, ArrayBuffer.empty) += x
    }

    map
}

To use, just change .groupBy in your code to .orderedGroupBy.

The returned Map can't be mutated using this type (though it can be cast to mutable.Map or to mutable.LinkedHashMap), so it's safe enough for most purposes (and you could create a ListMap from it at the end if really desired).

Upvotes: 1

Related Questions