joe
joe

Reputation: 533

How do I pattern match with Data.Text in Haskell?

I am currently in the process of writing a parser in Haskell. I have the following code.

{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE OverloadedStrings #-}

module Main where

import Data.Text

newtype Parser a = Parser { runParser :: Text -> Either Text (Text, a) }

char1 :: Char -> Parser Char
char1 c = Parser $ \case
    (x:xs) | x == c -> Right (xs, x)
    _               -> Left "Unexpected character"

It fails to compile with these two errors.

test.hs:12:6: error:
    • Couldn't match expected type ‘Text’ with actual type ‘[Char]’
    • In the pattern: x : xs
      In a case alternative: (x : xs) | x == c -> Right (xs, x)
      In the second argument of ‘($)’, namely
        ‘\case
           (x : xs) | x == c -> Right (xs, x)
           _ -> Left "Unexpected character"’
   |
12 |     (x:xs) | x == c -> Right (xs, x)
   |      ^^^^

test.hs:12:24: error:
    • Couldn't match type ‘[Char]’ with ‘Text’
      Expected type: Either Text (Text, Char)
        Actual type: Either Text ([Char], Char)
    • In the expression: Right (xs, x)
      In a case alternative: (x : xs) | x == c -> Right (xs, x)
      In the second argument of ‘($)’, namely
        ‘\case
           (x : xs) | x == c -> Right (xs, x)
           _ -> Left "Unexpected character"’
   |
12 |     (x:xs) | x == c -> Right (xs, x)
   |                        ^^^^^^^^^^^^^

I can fix the error by replacing the Text data type with String but I would prefer to use the Text data type.

Is there a way to pattern match with the Data.Text type without first explicitly converting it to a string first? Perhaps there is a GHC extension that would allow me to do this?

Thanks in advance.

Upvotes: 6

Views: 1162

Answers (2)

K. A. Buhr
K. A. Buhr

Reputation: 50904

A a refinement to @DanielWagner's answer, you can combine view patterns and pattern synonyms to do this. You'll need a new constructor in place of :, but it might look like:

{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE ViewPatterns #-}

import Data.Text

pattern x :> xs <- (uncons -> Just (x, xs))
pattern Empty <- (uncons -> Nothing)

findInText :: (Char -> Bool) -> Text -> Maybe Char
findInText _ Empty = Nothing
findInText p (x :> xs) | p x = Just x
                       | otherwise = findInText p xs

The idea here is that a pattern x :> xs is a synonym for the pattern uncons -> Just (x, xs) which is a view pattern that operates by applying uncons to the scrutinee and pattern-matching the result with Just (x, xs) to population x and xs for the parent pattern.

As per the comment, there might be some concern about whether this usage ends up calling uncons more than once. With optimization entirely shut off (-O0), the generated core does have multiple uncons calls:

-- unoptimized -O0
findInText
  = \ ds ds1 ->
      case uncons ds1 of {
        Nothing -> Nothing;
        Just ipv ->
          case uncons ds1 of {
            Nothing -> ...

With optimization on (-O or -O2), everything gets inlined and the generated core is incredibly complicated because of the Unicode processing going on. However, if you also define:

findInText' :: (Char -> Bool) -> Text -> Maybe Char
findInText' p txt = case uncons txt of
  Nothing -> Nothing
  Just (x, xs) | p x -> Just x
               | otherwise -> findInText' p xs

it turns out that GHC compiles findInText' to:

findInText' = findInText

so it looks like in this case at least, GHC doesn't do any extra work as a result of the view patterns.

Upvotes: 7

Daniel Wagner
Daniel Wagner

Reputation: 152867

You can match on a call to uncons.

case uncons text of
    Just (x, xs) -> ...
    Nothing -> ...

View patterns let you do this within the pattern instead of within the scrutinee, but require you to say uncons once for each pattern.

case text of
    (uncons -> Just (x, xs)) -> ...
    (uncons -> Nothing) -> ...

Upvotes: 5

Related Questions