atis

Reputation: 921

Parsing user input string to UTCTime type

I'm using optparse-applicative to parse user input.

I'm trying to write a function that'll take a string as an input and return UTCTime.

parseTime :: String -> Maybe UTCTime

but I don't want it to fail unless the user inputs something completely off the mark.

So I pass in a fallbackTime argument, which is just the result of getCurrentTime.

So the new parseTime function looks like this

parseTime :: UTCTime -> String -> Maybe UTCTime



Examples

Input: 2019-05-04, 12:49pm

Parsed: 2019-05-04 12:49:29 UTC

So, reading left to right, the fields go year, month, day, hour, minute, second.


If there's no day half identifier (AM/PM)

Input: 2019-05-04, 12:49

Parsed: 2019-05-04 00:49:29 UTC

should default to fallbackTime's half.


Now if the user doesn't give a specific year, I want it to default to fallbackTime's year.

Input: 05 04, 12:49

Parsed: 2020-05-04 00:49:29 UTC

Input: 04, 12:49

Parsed: 2020-05-04 00:49:29 UTC


If the user doesn't input the year, month or minutes, I want those to default to fallbackTime's values.

Input: 04, 12

Parsed: 2020-05-04 00:49:29 UTC


If the user doesn't input anything but just one number

Input: 12

It should be treated as the hour, and the rest should be filled in from fallbackTime.

Parsed: 2020-05-04 00:49:29 UTC


I want it to fail only if the input is too long

Input: 2019 05 04 05 06 07 08 09 10 11 12, 12:49

or just gibberish

Input: alpaca bob-cat regular-cat dog elephant


So basically missing things should be replaced with fallback things.

So far I've found two main strategies that work:

  1. Parse user input into UTCTime and then fix it.

  2. Fix user input string and then try to parse it into UTCTime


The first strategy includes functions like

gotYear  :: String -> Bool
gotMonth :: String -> Bool
gotDay   :: String -> Bool

which check whether the user has given enough numbers to be parsed as at least an hour; the rest is then filled in from fallbackTime.

The second strategy splits the user input at the , if there is one. The first part of that tuple is fed to a function like this

fixupDate :: UTCTime -> Maybe String -> Maybe String

and the second one is fed to a similar function

fixupTime :: UTCTime -> Maybe String -> Maybe String

The Maybe here takes care of a missing date when there's no , delimiter.
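To make the split step concrete, here is a minimal sketch. The name splitAtComma is hypothetical (it is not from my real code), and it assumes at most one comma matters, with optional spaces after it:

```haskell
-- Hypothetical helper: split "date, time" input at the first comma.
-- The date part is Nothing when the input contains no comma at all.
splitAtComma :: String -> (Maybe String, String)
splitAtComma s = case break (== ',') s of
  (date, ',' : rest) -> (Just date, dropWhile (== ' ') rest)
  (timeOnly, _)      -> (Nothing, timeOnly)
```

The Maybe String for the date is what would then go to fixupDate.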

Finally it is parsed like this

parseTimeM True defaultTimeLocale <format string> <result of parsing>

<result of parsing> looks like this

2019-05-04 12:45

which is parsed using this format string

"%Y-%-m-%-d %l:%M"

Both of those strategies work fine, but the implementations span many lines of code and feel extremely over-engineered.

Splitting the string into tokens and handling the delimiters are both fairly straightforward. The main problem is replacing the missing pieces with fallbackTime's values.

Is there a simpler, more functional/applicative way of doing this?

I'm thinking something like this pseudo-code

parseTime userInputTime fallbackTime = UTCTime { userInputYear <|> fallbackYear
                                               , userInputMonth <|> fallbackMonth
                                               , ...
                                               , ...
                                               }

                where userInputYear  = getUserInputYear  userInputTime :: Maybe Year
                      userInputMonth = getUserInputMonth userInputTime :: Maybe Month 
                      ...
                      ...
                      fallbackYear   = getYear  fallbackTime  :: Maybe Year
                      fallbackMonth  = getMonth fallbackTime  :: Maybe Month

or should I stick with one of those strategies and try to make the code more readable?
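For reference, a minimal runnable sketch of that pseudo-code, assuming the time package. PartialTime and fillFrom are hypothetical names, and the tokenizing step that would produce a PartialTime is left out. Since the fallback always has every field, fromMaybe is enough here and <|> isn't needed:

```haskell
import Data.Maybe (fromMaybe)
import Data.Time

-- Hypothetical record: a field is Nothing when the user left it out.
data PartialTime = PartialTime
  { pYear   :: Maybe Integer
  , pMonth  :: Maybe Int
  , pDay    :: Maybe Int
  , pHour   :: Maybe Int
  , pMinute :: Maybe Int
  }

-- Fill every missing field from the fallback, then validate the result.
-- Seconds are always copied from the fallback (an assumption of this sketch).
fillFrom :: UTCTime -> PartialTime -> Maybe UTCTime
fillFrom fallback p = do
  let (fy, fm, fd) = toGregorian (utctDay fallback)
      TimeOfDay fh fmin fs = timeToTimeOfDay (utctDayTime fallback)
  -- Nothing if the combined date does not exist (e.g. 30 February)
  day <- fromGregorianValid (fromMaybe fy (pYear p))
                            (fromMaybe fm (pMonth p))
                            (fromMaybe fd (pDay p))
  -- Nothing if the hour or minute is out of range
  tod <- makeTimeOfDayValid (fromMaybe fh (pHour p))
                            (fromMaybe fmin (pMinute p))
                            fs
  pure (UTCTime day (timeOfDayToTime tod))
```

With a fallback of 2020-05-04 00:49:29 UTC, a PartialTime holding only day 4 and hour 12 gives 2020-05-04 12:49:29 UTC.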

Upvotes: 0

Views: 62

Answers (1)

Bob Dalgleish

Reputation: 8227

I have one general comment: you are biting off a large chunk of functionality with this approach and, as you suggested, it is greatly over-engineered.

  1. You should not provide the seconds value when it is not present in the source. If seconds is not present in the source, its value is zero.
  2. Without an AM/PM indicator, 12:49 should be 12:49:00. Don't set the hour field to zero.
  3. When the year is missing, you will run afoul very quickly of people who use the American-style day-then-month notation.
  4. Similarly, do not try to guess the month. The hour should default to 00 if not present, as you would for minutes and seconds.
  5. One number, by itself, is so lacking in context that you cannot reasonably assume its significance. Stay away from this.
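As a rough illustration of points 1 and 2, here is a sketch that accepts a single unambiguous layout with a 24-hour clock, assuming the time package. The name parseStrict and the exact format string are my assumptions, not something from your code:

```haskell
import Data.Time

-- Hypothetical helper: accept exactly one layout, e.g. "2019-05-04, 12:49".
-- %H is the 24-hour clock, so no AM/PM guessing is involved, and fields
-- that the format does not mention (seconds) come out as zero.
parseStrict :: String -> Maybe UTCTime
parseStrict = parseTimeM True defaultTimeLocale "%Y-%m-%d, %H:%M"
```

Here parseStrict "2019-05-04, 12:49" yields 2019-05-04 12:49:00 UTC, while a lone "12" or plain gibberish yields Nothing.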

You are greatly over-engineering something that is fraught with pitfalls. Any code solution you attempt will necessarily be long and convoluted.

Also, if you need half-a-page of documentation to explain your choices to the user of your system, you are engineering the wrong end of the problem.

Upvotes: 2
