Reputation: 303
Let's say I want to parse a string with various opening and closing brackets (I used parentheses in the title because I believe it is more common -- the question is the same nevertheless) so that I get all the higher levels separated in a list.
Given:
[hello:=[notting],[hill]][3.4(4.56676|5.67787)][the[hill[is[high]]not]]
I want:
List("[hello:=[notting],[hill]]", "[3.4(4.56676|5.67787)]", "[the[hill[is[high]]not]]")
The way I am doing this is by counting the opening and closing brackets and adding to the list whenever I get my counter to 0. However, I have an ugly imperative code. You may assume that the original string is well formed.
My question is: what would be a nice functional approach to this problem?
Notes: I have thought of using the for...yield construct but given the use of the counters I cannot get a simple conditional (I must have conditionals just for updating the counters as well) and I do not know how I could use this construct in this case.
Upvotes: 8
Views: 3741
Reputation: 13221
Quick solution using Scala parser combinator library:
import util.parsing.combinator.RegexParsers
object Parser extends RegexParsers {
lazy val t = "[^\\[\\]\\(\\)]+".r
def paren: Parser[String] =
("(" ~ rep1(t | paren) ~ ")" |
"[" ~ rep1(t | paren) ~ "]") ^^ {
case o ~ l ~ c => (o :: l ::: c :: Nil) mkString ""
}
def all = rep(paren)
def apply(s: String) = parseAll(all, s)
}
Checking it in REPL:
scala> Parser("[hello:=[notting],[hill]][3.4(4.56676|5.67787)][the[hill[is[high]]not]]")
res0: Parser.ParseResult[List[String]] = [1.72] parsed: List([hello:=[notting],[hill]], [3.4(4.56676|5.67787)], [the[hill[is[high]]not]])
Upvotes: 7
Reputation: 51109
You have an ugly imperative solution, so why not make a good-looking one? :)
This is an imperative translation of huynhjl's solution, but just posting to show that sometimes imperative is concise and perhaps easier to follow.
def parse(s: String) = {
var res = Vector[String]()
var depth = 0
var block = ""
for (c <- s) {
block += c
c match {
case '[' => depth += 1
case ']' => depth -= 1
if (depth == 0) {
res :+= block
block = ""
}
case _ =>
}
}
res
}
Upvotes: 2
Reputation: 41646
Given your requirements counting the parenthesis seems perfectly fine. How would you do that in a functional way? You can make the state explicitly passed around.
So first we define our state which accumulates results in blocks
or concatenates the next block
and keeps track of the depth:
case class Parsed(blocks: Vector[String], block: String, depth: Int)
Then we write a pure function that processed that returns the next state. Hopefully, we can just carefully look at this one function and ensure it's correct.
def nextChar(parsed: Parsed, c: Char): Parsed = {
import parsed._
c match {
case '[' | '(' => parsed.copy(block = block + c,
depth = depth + 1)
case ']' | ')' if depth == 1
=> parsed.copy(blocks = blocks :+ (block + c),
block = "",
depth = depth - 1)
case ']' | ')' => parsed.copy(block = block + c,
depth = depth - 1)
case _ => parsed.copy(block = block + c)
}
}
Then we just used a foldLeft
to process the data with an initial state:
val data = "[hello:=[notting],[hill]][3.4(4.56676|5.67787)][the[hill[is[high]]not]]"
val parsed = data.foldLeft(Parsed(Vector(), "", 0))(nextChar)
parsed.blocks foreach println
Which returns:
[hello:=[notting],[hill]]
[3.4(4.56676|5.67787)]
[the[hill[is[high]]not]]
Upvotes: 2
Reputation: 53348
What about:
def split(input: String): List[String] = {
def loop(pos: Int, ends: List[Int], xs: List[String]): List[String] =
if (pos >= 0)
if ((input charAt pos) == ']') loop(pos-1, pos+1 :: ends, xs)
else if ((input charAt pos) == '[')
if (ends.size == 1) loop(pos-1, Nil, input.substring(pos, ends.head) :: xs)
else loop(pos-1, ends.tail, xs)
else loop(pos-1, ends, xs)
else xs
loop(input.length-1, Nil, Nil)
}
scala> val s1 = "[hello:=[notting],[hill]][3.4(4.56676|5.67787)][the[hill[is[high]]not]]"
s1: String = [hello:=[notting],[hill]][3.4(4.56676|5.67787)][the[hill[is[high]]not]]
scala> val s2 = "[f[sad][add]dir][er][p]"
s2: String = [f[sad][add]dir][er][p]
scala> split(s1) foreach println
[hello:=[notting],[hill]]
[3.4(4.56676|5.67787)]
[the[hill[is[high]]not]]
scala> split(s2) foreach println
[f[sad][add]dir]
[er]
[p]
Upvotes: 4
Reputation: 11237
Try this:
val s = "[hello:=[notting],[hill]][3.4(4.56676|5.67787)][the[hill[is[high]]not]]"
s.split("]\\[").toList
returns:
List[String](
[hello:=[notting],[hill],
3.4(4.56676|5.67787),
the[hill[is[high]]not]]
)
Upvotes: 0