Reputation: 671
I am trying to separate a string using a delimiter consisting of multiple characters, but the problem is that each of those characters can occur by itself in non-delimiting string. For example, I have foo*X*bar*X*baz
, where the delimiter is *X*
, so I want to get [foo, bar, baz]
, but each one of those can contain *
or X
.
I have tried
sepBy (many anyChar) delimiter
but that just swallows the whole string, giving "foo*X*bar*X*baz"
, if I do
sepBy anyChar (optional delimiter)
it filters out the delimiters correctly, but doesn't partition the list, returning "foobarbaz"
. I don't know which other combination I could try.
Upvotes: 2
Views: 1051
Reputation: 3791
Perhaps you want something like this,
tok = (:) <$> anyToken <*> manyTill anyChar (try (() <$ string sep) <|> eof)
The anyToken
prevents us from looping forever at the end of input, the try
lets us avoid being over-eager in consuming separator characters.
Full code for a test,
module ParsecTest where
import Control.Applicative ((<$), (<$>), (<*>))
import Data.List (intercalate)
import Text.Parsec
import Text.Parsec.String
sep,msg :: String
sep = "*X*"
msg = intercalate "*X*" ["foXo", "ba*Xr", "bX*az"]
tok :: Parser String
tok = (:) <$> anyToken <*> manyTill anyChar (try (() <$ string sep) <|> eof)
toks :: Parser [String]
toks = many tok
test :: Either ParseError [String]
test = runP toks () "" msg
Upvotes: 3