Reputation: 1372
I'm trying to implement a date printing and parsing system using Parsec.
I have successfully implemented a printing function of type
showDate :: String -> Date -> Parser String
It takes parses a formatting string and creates a new string based on the tokens that the formatted string presented.
For example
showDate "%d-%m-%Y" $ Date 2015 3 17
has the output Right "17-3-2015"
I already wrote a tokenizer to use in the showDate
function, so I thought that I could just use the output of that to somehow generate a parser using the function readDate :: [Token] -> Parser Date
. My idea quickly came to a halt as I realised I had no idea how to implement this.
Assume we have the following functions and types (the implementation doesn't matter):
data Token = DayNumber | Year | MonthNumber | DayOrdinal | Literal String
-- Parses four digits and returns an integer
pYear :: Parser Integer
-- Parses two digits and returns an integer
pMonthNum :: Parser Int
-- Parses two digits and returns an integer
pDayNum :: Parser Int
-- Parses two digits and an ordinal suffix and returns an integer
pDayOrd :: Parser Int
-- Parses a string literal
pLiteral :: String -> Parser String
The parser readDate [DayNumber,Literal "-",MonthNumber,Literal "-",Year]
should be equivalent to
do
d <- pDayNum
pLiteral "-"
m <- pMonthNum
pLiteral "-"
y <- pYear
return $ Date y m d
Similarly, the parser readDate [Literal "~~", MonthNumber,Literal "hello",DayNumber,Literal " ",Year]
should be equivalent to
do
pLiteral "~~"
m <- pMonthNum
pLiteral "hello"
d <- pDayNum
pLiteral " "
y <- pYear
return $ Date y m d
My intuition suggests there's some kind of concat/map/fold using monad bindings that I can use for this, but I have no idea.
Is parsec the right tool for this?
Is my approach convoluted or ineffective?
If not, how do I achieve this functionality?
If so, what should I try to do instead?
Upvotes: 3
Views: 284
Reputation: 24156
Your Token
s are instructions in a small little language for date formats [Token]
.
import Data.Functor
import Text.Parsec
import Text.Parsec.String
data Date = Date Int Int Int deriving (Show)
data Token = DayNumber | Year | MonthNumber | Literal String
In order to interpret this language, we need a type that represents the state of the interpreter. We start off not knowing any of the components of the Date
and then discover them as we encounter DayNumber
, Year
, or MonthNumber
. The following DateState
represents the state of knowing or not knowing each of the components of the Date
.
data DateState = DateState {dayState :: (Maybe Int), monthState :: (Maybe Int), yearState :: (Maybe Int)}
We will start interpreting a [Token]
with DateState Nothing Nothing Nothing
.
Each Token
will be converted into a function that reads the DateState
and produces a parser that computes the new DateState
.
readDateToken :: Token -> DateState -> Parser DateState
readDateToken (DayNumber) ds =
do
day <- pNatural
return ds {dayState = Just day}
readDateToken (MonthNumber) ds =
do
month <- pNatural
return ds {monthState = Just month}
readDateToken (Year) ds =
do
year <- pNatural
return ds {yearState = Just year}
readDateToken (Literal l) ds = string l >> return ds
pNatural :: Num a => Parser a
pNatural = fromInteger . read <$> many1 digit
To read a date interpreting a [Token]
we will first convert it into a list of functions that decide how to parse a new state based on the current state with map readDateToken :: [Token] -> [DateState -> Parser DateState]
. Then, starting with a parser that succeeds with the initial state return (DateState Nothing Nothing Nothing)
we will bind all of these functions together with >>=
. If the resulting DateState
doesn't completely define the Date
we will complain that the [Token]
s was invalid. We also could have checked this ahead of time. If you want to include invalid date errors as parsing errors this would also be the place to check that the Date
is valid and doesn't represent a non-existent date like April 31st.
readDate :: [Token] -> Parser Date
readDate tokens =
do
dateState <- foldl (>>=) (return (DateState Nothing Nothing Nothing)) . map readDateToken $ tokens
case dateState of
DateState (Just day) (Just month) (Just year) -> return (Date day month year)
_ -> fail "Date format is incomplete"
We will run a few examples.
runp p s = runParser p () "runp" s
main = do
print . runp (readDate [DayNumber,Literal "-",MonthNumber,Literal "-",Year]) $ "12-3-456"
print . runp (readDate [Literal "~~", MonthNumber,Literal "hello",DayNumber,Literal " ",Year]) $ "~~3hello12 456"
print . runp (readDate [DayNumber,Literal "-",MonthNumber,Literal "-",Year,Literal "-",Year]) $ "12-3-456-789"
print . runp (readDate [DayNumber,Literal "-",MonthNumber]) $ "12-3"
This results in the following outputs. Notice that when we asked to read the Year
twice, the second of the two years was used in the Date
. You can choose a different behavior by modifying the definitions for readDateToken
and possibly modifying the DateState
type. When the [Token]
didn't specify how to read one of the date fields we get the error Date format is incomplete
with a slightly incorrect description; this could be improved upon.
Right (Date 12 3 456)
Right (Date 12 3 456)
Right (Date 12 3 789)
Left "runp" (line 1, column 5):
unexpected end of input
expecting digit
Date format is incomplete
Upvotes: 4