Reputation: 3387
I am learning parsec and just ran into the following situation. I want to split a String into a [String] on a specific separator String. For example, given "abcSEPdef" with the separator "SEP", parsing should yield ["abc","def"].
I believe the parser should look like sepBy a_parser (string "SEP"); however, I don't know what a_parser should look like.
Upvotes: 2
Views: 368
Reputation: 3426
The replace-megaparsec package has a sepCap combinator for splitting strings and capturing the separators.
import Data.Void (Void)
import Replace.Megaparsec
import Text.Megaparsec
parseTest (sepCap (chunk "SEP" :: Parsec Void String String)) "abcSEPdef"
[Left "abc",Right "SEP",Left "def"]
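To get from that result to the plain [String] the question asks for, one option is to keep the Left chunks and drop the Right separators. This is only a sketch: the name `fields` is made up for illustration, and it works on the already-parsed list rather than calling the library. It simply discards every Right, so whether empty fields between adjacent separators survive depends on what sepCap emits for them.

```haskell
import Data.Either (lefts)

-- Hypothetical helper: keep the text between separators (the Left values)
-- and throw away the captured separators (the Right values).
fields :: [Either String String] -> [String]
fields = lefts
```

Applied to the output above, `fields [Left "abc", Right "SEP", Left "def"]` gives `["abc","def"]`.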
Upvotes: 0
Reputation: 3387
I finally found a way to incorporate the split package into parsec:
module Sep where
import Text.ParserCombinators.Parsec
import qualified Data.List.Split as DLS
mysep :: String -> Parser [String]
-- Splits the remaining input on sep. Note that getInput only reads the
-- input; it does not consume it.
mysep sep = DLS.splitOn sep <$> getInput
Upvotes: 1
Reputation: 52029
Using manyTill a few times will work:
-- try lets string "SEP" backtrack when only a prefix such as "S" matches
uptoSEP = manyTill anyChar (eof <|> (try (string "SEP") >> return ()))
splitSEP = manyTill uptoSEP eof
E.g.:
ghci> parseTest splitSEP "abcSEPdefSEPxyz"
["abc","def","xyz"]
You'll want to enable the {-# LANGUAGE NoMonomorphismRestriction #-} pragma, or give both definitions explicit type signatures.
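For reference, here is a sketch of the same two parsers with explicit type signatures, in which case the pragma should not be needed. It assumes the input is a plain String (hence Parser from Text.Parsec.String), and it wraps the separator in try so that a lone "S" in the text does not make the parse fail halfway through a partial match.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Characters up to the next "SEP" (which is consumed) or the end of input.
uptoSEP :: Parser String
uptoSEP = manyTill anyChar (eof <|> (try (string "SEP") >> return ()))

-- Repeat uptoSEP until the whole input has been read.
splitSEP :: Parser [String]
splitSEP = manyTill uptoSEP eof
```

One design note: because the final chunk is ended by eof rather than by a separator, an input that ends in "SEP" yields no trailing empty string.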
Upvotes: 3
Reputation: 21
Find a parser for the negation of "SEP", and call it parseNonSEP. In theory such a parser is guaranteed to exist within the regular languages, because regular languages are closed under complement, so there should be a straightforward way to implement it.
Then,
sepBy parseNonSEP (string "SEP")
will do the job.
Well, what I described above is a rather theoretical approach :) A more parsec-style way may be to look ahead at the input without actually consuming it, and/or to use backtracking, with combinators such as try, notFollowedBy, and lookAhead.
See
http://hackage.haskell.org/package/parsec-3.1.9/docs/Text-Parsec-Combinator.html
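A minimal sketch of that look-ahead idea, assuming String input and a non-empty separator. The names nonSep and splitOnSep are invented here for illustration; only notFollowedBy, anyChar, string, many and sepBy come from parsec.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- One character, accepted only if the separator does not start here.
-- notFollowedBy never consumes input, so a partial match such as "SE"
-- is harmless.
nonSep :: String -> Parser Char
nonSep sep = notFollowedBy (string sep) >> anyChar

-- The a_parser from the question is `many (nonSep sep)`.
-- The separator is assumed to be non-empty.
splitOnSep :: String -> Parser [String]
splitOnSep sep = many (nonSep sep) `sepBy` string sep
```

With this, `parse (splitOnSep "SEP") "" "abcSEPdef"` should return `Right ["abc","def"]`.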
Upvotes: 1