Reputation: 15
What is the general accepted implementation of a recursive split-by-token function?
For example:
val s: List[Char] = 'F' :: 'o' :: 'o' :: ' ' :: 'b' :: 'a' :: 'r' :: ' ' :: 'f' :: 'o':: 'o':: ' ' :: 'b' :: 'a' :: 'r' :: Nil
fooSplit(s) // -> List(List('F','o','o'), List('b', 'a', 'r'), List('f', 'o', 'o'), List('b','a','r'))
Here are the base cases I've come up with, when doing pattern matching on the parameter:
def fooSplit(text: List[Char]): List[List[Char]] = text match {
case x :: ' ' :: xs =>
// Somehow append old list to fooSplit(xs) <?>
println(x + "_")
fooSplit(xs)
case x :: xs =>
println(x + "")
fooSplit(xs)
case Nil => Nil
}
Most of my attempts following the above structure almost always lead to List('F', List('O', List('O'), ...)...)
. Without coming up with some very sinister double pattern matching I can't seem to wrap my head on to extend instead of append to the inside list.
I would also like to quote the Scala Cookbook:
Use flatMap in situations where you run map followed by flatten. The specific situation is this:
• You’re using map (or a for/yield expression) to create a new collection from an existing collection.
• The resulting collection is a list of lists.
• You call flatten immediately after map (or a for/yield expression)
This seems like a very possible case to use flatMap, could anyone provide insight on this?
Upvotes: 0
Views: 160
Reputation: 3173
Just because you're returning an instance of List[List[String]]
doesn't mean you need to flatten it, sometimes you might need those values. I would suggest this:
def fooSplit(text: List[Char]): List[List[Char]] = {
@tailrec
def fooSplitHelper(currentListBeforeToken: List[Char] = List.empty, list: List[Char], acc: List[List[Char]] = List.empty): List[List[Char]] = {
if (list.isEmpty) {
currentListBeforeToken.reverse :: acc
} else {
list.head match {
case ' ' => fooSplitHelper(List.empty[Char], list.tail, currentListBeforeToken.reverse :: acc)
case otherChar => fooSplitHelper(otherChar :: currentListBeforeToken, list.tail, acc)
}
}
}
fooSplitHelper(list = text).reverse
}
Upvotes: 1
Reputation: 40500
One crucial realization to keep this function from becoming "sinister" is that you can hold the current sublist being built as its own variable, and only add it to the result when it is complete.
From there, it's pretty simple.
There are basically three cases: end of list (done!), current element is space (add intermediate to result, and start with empty again), everything else (just keep adding to intermediate), so you just spell them out:
@tailrec
def split(
in: List[Char],
next: List[Char] = Nil,
out: List[List[Char]] = Nil
): List[List[Char]] = in match {
case Nil => out.map(_.reverse).reverse
case ' ' :: tail => split(tail, Nil, next :: out)
case head :: tail => split(tail, head :: next, out)
}
A couple of common tricks here are that you pass the actual result you are building down to the recursive calls, rather than just returning it (that way the function can be optimized as tail-recursive - doesn't need space on stack), and also, you build your lists in reverse direction, prepending elements to them, rather than appending, and then reverse at the end, when the list is complete (this allows implementation to remain linear, because prepending to a list is constant time, unlike append, which is O(N), making the whole thing quadratic (yuck!)).
Upvotes: 4