Fredrik Nylén

Reputation: 577

Parsec output on unicode (UTF-8) char

I just need to understand something about Parsec.

parseTest (many1 alphaNum) "re2re1Δ"
"re2re1\916"
:t parseTest (many1 alphaNum) 
parseTest (many1 alphaNum) :: Text.Parsec.Prim.Stream s Data.Functor.Identity.Identity Char =>
 s -> IO ()

So the Unicode character (the input should be UTF-8, since I am on OS X) is printed as a numeric (hex?) code instead of the Greek delta character. However, putChar does not perform the same conversion in the same ghci session (and the same terminal):

Text.Parsec.Char> putChar 'Δ'
Δ

How come? Both should just be values of type Char, shouldn't they?

Upvotes: 2

Views: 501

Answers (1)

Sibi

Reputation: 48644

The difference comes from the way show and putChar are implemented.

λ> show "re2re1Δ"
"\"re2re1\\916\""
λ> mapM_ putChar "re2re1Δ"
re2re1Δ

From the source you can see that the Show instance for Char is defined like this:

instance  Show Char  where
    showsPrec _ '\'' = showString "'\\''"
    showsPrec _ c    = showChar '\'' . showLitChar c . showChar '\''

    showList cs = showChar '"' . showl cs
                 where showl ""       s = showChar '"' s
                       showl ('"':xs) s = showString "\\\"" (showl xs s)
                       showl (x:xs)   s = showLitChar x (showl xs s)
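
To see the escaping in isolation, here is a minimal sketch that calls showLitChar (exported from Data.Char) directly and compares it with raw output. It assumes the source file is saved as UTF-8 and that the terminal uses a UTF-8 locale; the expected-output comments rely on that assumption.

```haskell
import Data.Char (ord, showLitChar)

main :: IO ()
main = do
  -- Code point of the Greek capital delta, in decimal (U+0394).
  print (ord 'Δ')                 -- prints 916
  -- showLitChar escapes characters above ASCII as a backslash plus decimal code point.
  putStrLn (showLitChar 'Δ' "")   -- prints \916
  -- show on a String goes through showList, which uses the same escaping.
  putStrLn (show "re2re1Δ")       -- prints "re2re1\916"
  -- putStrLn writes the characters themselves, with no escaping.
  putStrLn "re2re1Δ"              -- prints re2re1Δ
```

So the number after the backslash is the decimal code point, not a hex value.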

putChar is implemented like this:

putChar         :: Char -> IO ()
putChar c       =  hPutChar stdout c

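The parseTest function internally uses print, which in turn uses show. Since show escapes non-ASCII characters via showLitChar, you get a backslash followed by the decimal Unicode code point of delta (916, i.e. U+0394) instead of the character itself. putChar, by contrast, writes the character to stdout without escaping.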
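
If you want the parsed text displayed unescaped, one option is to bypass parseTest and print the result yourself. The following is only a sketch: it uses parse from Text.Parsec with an empty source name, and it again assumes a UTF-8 source file and terminal.

```haskell
import Text.Parsec (parse, many1, alphaNum)

main :: IO ()
main =
  -- Run the same parser as in the question on the same input.
  case parse (many1 alphaNum) "" "re2re1Δ" of
    -- ParseError has a Show instance, so print is fine for the error case.
    Left err  -> print err
    -- putStrLn writes the String as-is, so Δ is not escaped by show.
    Right str -> putStrLn str
```

This works here because the parser returns a String; for other result types you would need your own rendering function in place of putStrLn.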

Upvotes: 7
