Iulia Muntianu
Iulia Muntianu

Reputation: 135

Determining matching parenthesis in Haskell

Assuming I have a string such as:

abc(def(gh)il(mn(01))afg)lmno(sdfg*)

How can I determine the matching bracket for the first one? (meaning (def(gh)il(mn(01))afg))

I have tried to create a between function by counting the number of open brackets until the first ')', but it doesn't work on strings like this one.

Upvotes: 2

Views: 2258

Answers (2)

Stefan Holdermans
Stefan Holdermans

Reputation: 8050

You could use a function that simply traverses the string while keeping track of a stack of indices of opening parentheses. Whenever you encounter a closing parenthesis, you know that it matches with the index at the top of the stack.

For example:

parenPairs :: String -> [(Int, Int)]
parenPairs = go 0 []
  where
    go _ _        []         = []
    go j acc      ('(' : cs) =          go (j + 1) (j : acc) cs
    go j []       (')' : cs) =          go (j + 1) []        cs -- unbalanced parentheses!
    go j (i : is) (')' : cs) = (i, j) : go (j + 1) is        cs
    go j acc      (c   : cs) =          go (j + 1) acc       cs

This function returns a list of all pairs of indices belonging to pairs of matching parentheses.

Applying the function to your example string gives:

> parenPairs "abc(def(gh)il(mn(01))afg)lmno(sdfg*)"
[(7,10),(16,19),(13,20),(3,24),(29,35)]

The opening parenthesis you were interested in appears at index 3. The returned list shows that the matching closing parenthesis is to be found at index 24.

The following functions gives you all properly parenthesised segments of a string:

parenSegs :: String -> [String]
parenSegs s = map (f s) (parenPairs s)
  where
    f s (i, j) = take (j - i + 1) (drop i s)

For example:

> parenSegs "abc(def(gh)il(mn(01))afg)lmno(sdfg*)"
["(gh)","(01)","(mn(01))","(def(gh)il(mn(01))afg)","(sdfg*)"]

Following Frerich Raabe's suggestion, we can now also write a function that only returns the leftmost segment:

firstParenSeg :: String -> String
firstParenSeg s = f s (minimum (parenPairs s))
  where
    f s (i, j) = take (j - i + 1) (drop i s)

For example:

> firstParenSeg "abc(def(gh)il(mn(01))afg)lmno(sdfg*)"
"(def(gh)il(mn(01))afg)"

Note that firstParenSeg will fail if the input string does not contain at least one pair of matching parentheses.

Finally, a small adaption of the parenPairs function lets it fail on unbalanced parentheses:

parenPairs' :: String -> [(Int, Int)]
parenPairs' = go 0 []
  where
    go _ []        []         = []
    go _ (_ : _ )  []         = error "unbalanced parentheses!"
    go j acc       ('(' : cs) =          go (j + 1) (j : acc) cs
    go j []        (')' : cs) = error "unbalanced parentheses!"
    go j (i : is)  (')' : cs) = (i, j) : go (j + 1) is        cs
    go j acc       (c   : cs) =          go (j + 1) acc       cs

Upvotes: 7

Simple newbie solution using helper go function.

brackets :: String -> String
brackets string = go string 0 False
  where go (s:ss) 0 False | s /= '(' = go ss 0 False
        go ('(':ss) 0 False = '(' : go ss 1 True
        go (')':_) 1 True = ")"
        go (')':ss) n True = ')' : go ss (n-1) True
        go ('(':ss) n True = '(' : go ss (n+1) True
        go (s:ss) n flag = s : go ss n flag 
        go "" _ _ = ""

The idea is to remember some counter of opening brackets for each Char. And when that counter will be equal 1 and Char is equal ) - it is time to return the required string.

> brackets "abc(def(gh)il(mn(01))afg)lmno(sdfg*)"
"(def(gh)il(mn(01))afg)"

Note, that this function will return string with unclosed bracket for unbalanced string, like that:

> brackets "a(a(a"
"(a(a"

It could be avoided with another pattern matching condition.


UPD:

More readable solution is balancedSubstring function :: String -> Maybe String that returns Just requires substring if brackets is balanced and Nothing in other cases.

brackets :: String -> String
brackets string = go string 0 False
  where go (s:ss) 0 False | s /= '(' = go ss 0 False
        go ('(':ss) 0 False = '(' : go ss 1 True
        go (')':_) 1 True = ")"
        go (')':ss) n True = ')' : go ss (n-1) True
        go ('(':ss) n True = '(' : go ss (n+1) True
        go (s:ss) n flag = s : go ss n flag
        go "" _  _ = ""

isBalanced :: String -> Bool
isBalanced string = go string 0
  where go ('(':ss) n = go ss (n+1)
        go (')':ss) n | n > 0 = go ss (n-1)
        go (')':_ ) n | n < 1 = False
        go (_:ss) n = go ss n
        go "" 0 = True
        go "" _ = False

balancedSubstring :: String -> Maybe String
balancedSubstring string | isBalanced string = Just $ brackets string
balancedSubstring string | otherwise         = Nothing

So now result of balancedSubstring function is more understandable:

> balancedSubstring "abc(def(gh)il(mn(01))afg)lmno(sdfg*)"
Just "(def(gh)il(mn(01))afg)"

> balancedSubstring "a(a(a"
Nothing

Upvotes: 3

Related Questions