Reputation: 1857
I want to parse assembly programs. I have a fixed format for parsing an assembly address: [ register + offset + label ]
I implemented parsers for registers, offsets and labels. Now I want to create a parser which parses the whole address.
The combinations I want to accept:
[register]
[offset]
[label]
[register + offset]
[register + label]
[offset + label]
[register + offset + label]
And what I don't want to accept:
[]
[register offset]
[register + ]
...
Of course the simple solution is to have something like:
choice $ try (parseRegister >>= \r -> Address (Just r) Nothing Nothing)
<|> try ...
But it is ugly and does not scale well with more types of elements. So I'm looking for a cleaner solution.
Upvotes: 3
Views: 404
Reputation: 13243
I've been looking for something like that and found
Control.Applicative.Permutation
from action-permutations
. Though my case may scale independently from low-level platform.
In your case might look like
operand = do
(r, o, l) <- runPermsSep (char '+') $ (,,)
<$> maybeAtom register
<*> maybeAtom offset
<*> maybeAtom label
-- backtrack on inappropriate combination
when (null $ catMaybes [r, o, l]) . fail $ "operand expected"
return (r, o, l)
Note that you actually want optional permutation parser that requires at least one optional element to be present which makes your wanted parsers combinator pretty specific.
Upvotes: 1
Reputation: 55049
If you reorder your table, you see it’s a series of choices:
[register + offset + label]
[register + offset ]
[register + label]
[register ]
[ offset + label]
[ offset ]
[ label]
The grammar for which might be written:
address = '[' (register ('+' offset-label)? | offset-label) ']'
offset-label = offset ('+' label)? | label
Which in Applicative style is pretty straightforward, made only slightly noisy by wrapping everything in constructors:
parseAddress :: Parser Address
parseAddress = do
(register, (offset, label)) <- between (char '[') (char ']') parseRegisterOffsetLabel
return $ Address register offset label
parseRegisterOffsetLabel :: Parser (Maybe Register, (Maybe Offset, Maybe Label))
parseRegisterOffsetLabel = choice
[ (,)
<$> (Just <$> parseRegister)
<*> option (Nothing, Nothing) (char '+' *> parseOffsetLabel)
, (,) Nothing <$> parseOffsetLabel
]
parseOffsetLabel :: Parser (Maybe Offset, Maybe Label)
parseOffsetLabel = choice
[ (,)
<$> (Just <$> parseOffset)
<*> option Nothing (char '+' *> (Just <$> parseLabel))
, (,) Nothing . Just <$> parseLabel
]
If we add a couple of utility functions:
plus :: Parser a -> Parser a
plus x = char '+' *> x
just :: Parser a -> Parser (Maybe a)
just = fmap Just
We can clean up these implementations a bit:
parseRegisterOffsetLabel = choice
[ (,)
<$> just parseRegister
<*> option (Nothing, Nothing) (plus parseOffsetLabel)
, (,) Nothing <$> parseOffsetLabel
]
parseOffsetLabel = choice
[ (,)
<$> just parseOffset
<*> option Nothing (plus (just parseLabel))
, (,) Nothing <$> just parseLabel
]
Then factor out the repetition, giving us a decent final solution:
parseChain begin def rest = choice
[ (,) <$> just begin <*> option def (plus rest)
, (,) Nothing <$> rest
]
parseRegisterOffsetLabel = parseChain
parseRegister (Nothing, Nothing) parseOffsetLabel
parseOffsetLabel = parseChain
parseOffset Nothing (just parseLabel)
I’ll let you take care of whitespace around +
and inside []
.
Upvotes: 2
Reputation: 1622
You could have more elegant solution using Monoids
and sepBy1
.
But it allows to write [register + register]
(in our case adding them both)
parsePlus = many1 (char ' ') >> char '+' >> many1 (char ' ')
parseAddress1 =
try parseRegister
<|> parseOffset
<|> parseLabel
parseAddress = sepBy1 parsePlus parseAddress1 >>= return . mconcat
instance Monoid Address where
mempty = Address Nothing Nothing Nothing
Address r o l `mappend` Address r' o' l' =
Address (r `mappendA` r') (o `mappendA` o') (l `mappendA` l')
where
a `mappendA` a' = fmap getSum $ fmap Sum a `mappend` fmap Sum a'
Choosing Monoid (Sum a
, First a
, Last a
) for r
o
l
, we change the behavior:
Sum
adds each other, First
chooses first one, Last
chooses the last one
... where
a `mappendA` a' = getFirst $ First a `mappend` First a'
Upvotes: 0
Reputation: 1622
Something like that:
parsePlus = many1 (char ' ') >> char '+' >> many1 (char ' ')
parseRegisterModified = parsePlus >> parseOffsetLabel
parseOffsetModified = parsePlus >> parseLabel
parseRegister' = do
Address r _ _ <- parseRegister
optionMaybe parseRegisterModified >>=
return $ maybe
(Address r Nothing Nothing)
(\Address _ o l -> Address r o l)
parseOffset' = do
Address _ o _ <- parseOffset
optionMaybe parseOffsetModified >>=
return $ maybe
(Address Nothing o Nothing)
(\Address _ _ l -> Address Nothing o l)
parseOffsetLabel = try parseOffset' <|> parseLabel
parseAddress =
try parseRegister'
<|> parseOffset'
<|> parseLabel
Upvotes: 1