BrainFRZ

Reputation: 407

How do you apply multiple cases in a replace in Haskell?

I wrote a function to clean up number words for processing in Haskell. It needs to change hyphens into spaces (e.g. forty-five becomes forty five) and delete every other non-letter. I can define it recursively, but I'd really like to do something cleaner.

import Data.Char (isLetter)

clean :: String -> String
clean "" = ""
clean ('-':cs) = ' ' : clean cs
clean (c:cs)
    | isLetter c  = c : clean cs
    | otherwise   = clean cs

This led me to define a custom filter and build a replace on top of Data.List.Split, based on a comment on this answer, since I'm already using Data.List.Split.

import Data.Char (isLetter)
import Data.List (intercalate)
import Data.List.Split (splitOn)

clean :: String -> String
clean = filter (\c -> isLetter c || c == ' ') . replace "-" " " . filter (/= ' ')
  where
    replace :: String -> String -> String -> String
    replace old new = intercalate new . splitOn old

This version is even messier as a whole. Also, this version doesn't remove spaces in the original string. Is there a different convention or something built-in that would allow me to do this with a clean one-liner?

Upvotes: 3

Views: 835

Answers (3)

Jon Purdy

Reputation: 54989

This is a very good use case for do notation in the list monad.

import Data.Char (isLetter)

clean :: String -> String
clean string = do
  character <- string         -- For each character in the string...
  case character of
    '-'            -> " "     -- If it’s a dash, replace with a space.
    c | isLetter c -> pure c  -- If it’s a letter, return it.
    _              -> []      -- Otherwise, discard it.

This is ultimately simple syntactic sugar for concatMap. pure c can also be written [c] if you prefer; and less importantly, " " can be written pure ' ' or [' ']. And as an alternative, you may find this more readable with the MultiWayIf extension:

if
  | character == '-'   -> " "
  | isLetter character -> pure character
  | otherwise          -> []
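For reference, here is a minimal sketch of how that fragment might sit inside a complete definition. The name cleanMultiWay is only illustrative, and the pragma at the top is what enables the extension:

```haskell
{-# LANGUAGE MultiWayIf #-}

import Data.Char (isLetter)

-- Illustrative name; intended to behave like the case-based version above.
cleanMultiWay :: String -> String
cleanMultiWay string = do
  character <- string          -- Bind each character of the input in turn.
  if
    | character == '-'   -> " "             -- Dash becomes a single space.
    | isLetter character -> pure character  -- Letters are kept as they are.
    | otherwise          -> []              -- Everything else is dropped.
```

With this definition, cleanMultiWay "forty-five" should give "forty five".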

Finally, note that isLetter returns true for all Unicode letters. If you only care about ASCII, you may want to use isAscii c && isLetter c, or isAsciiUpper c || isAsciiLower c.
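To make the difference concrete, here is a small sketch that takes the "keep" predicate as a parameter. Both cleanWith and isAsciiLetter are hypothetical names introduced for this illustration; only the Data.Char functions come from base:

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, isLetter)

-- Hypothetical helper: True only for the letters A-Z and a-z.
isAsciiLetter :: Char -> Bool
isAsciiLetter c = isAsciiUpper c || isAsciiLower c

-- Hypothetical variant of clean that takes the predicate as an argument.
cleanWith :: (Char -> Bool) -> String -> String
cleanWith keep string = do
  character <- string
  case character of
    '-'        -> " "     -- Dash becomes a space.
    c | keep c -> pure c  -- Keep whatever the predicate accepts.
    _          -> []      -- Discard the rest.

-- The two flavours discussed above.
cleanUnicode, cleanAscii :: String -> String
cleanUnicode = cleanWith isLetter
cleanAscii   = cleanWith isAsciiLetter
```

On an input such as "café-crème", cleanUnicode keeps the accented letters, while cleanAscii drops them and returns "caf crme".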

Upvotes: 2

willeM_ Van Onsem

Reputation: 476709

There are two things here:

  1. you need to remove everything that is not a letter or hyphen; and
  2. next we replace hyphens with spaces.

So we can do this with a pipeline of filter and map:

  import Data.Bool(bool)
  import Data.Char(isLetter)

   map (\x -> bool ' ' x (x /= '-')) . filter (\x -> isLetter x || x == '-')
-- \____________ __________________/   \______________ ____________________/
--              v                                     v
--             (2)                                   (1)

We can also use a list comprehension to do the mapping and filtering:

import Data.Bool(bool)
import Data.Char(isLetter)

clean l = [bool ' ' x (x /= '-') | x <- l, isLetter x || x == '-']

We can also use a single function, and perform for instance a concatMap:

import Data.Bool(bool)
import Data.Char(isLetter)

clean = concatMap (\x -> bool (bool "" " " (x == '-')) [x] (isLetter x))

So here we concatenate the results of mapping each x to " " if x is a hyphen, to the empty string if x is neither a letter nor a hyphen, and to [x] (a one-character string) if x is a letter.
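As a quick sanity check, the three variants can be placed side by side and compared on the same input. This is only a sketch: the names cleanPipeline, cleanComprehension, cleanConcatMap and agree are made up here so the definitions can coexist in one module.

```haskell
import Data.Bool (bool)
import Data.Char (isLetter)

-- Variant 1: filter first, then map hyphens to spaces.
cleanPipeline :: String -> String
cleanPipeline = map (\x -> bool ' ' x (x /= '-')) . filter (\x -> isLetter x || x == '-')

-- Variant 2: the same mapping and filtering as a list comprehension.
cleanComprehension :: String -> String
cleanComprehension l = [bool ' ' x (x /= '-') | x <- l, isLetter x || x == '-']

-- Variant 3: a single concatMap that maps each character to a short string.
cleanConcatMap :: String -> String
cleanConcatMap = concatMap (\x -> bool (bool "" " " (x == '-')) [x] (isLetter x))

-- True when all three variants produce the same result for the given input.
agree :: String -> Bool
agree s = cleanPipeline s == cleanComprehension s
       && cleanComprehension s == cleanConcatMap s
```

For example, agree "forty-five, six" should be True, with each variant returning "forty fivesix".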

Upvotes: 4

castletheperson

Reputation: 33486

One of the most powerful functions for dealing with lists is concatMap (a.k.a. >>=). You can write your clean function like so:

import Data.Char (isLetter)

clean :: String -> String
clean = concatMap (\c -> if c == '-' then " " else [c | isLetter c])
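To illustrate the "a.k.a. >>=" remark, here is a sketch with the per-character function pulled out, so the same step can be fed to either concatMap or >>=. The names step, cleanConcatMap and cleanBind are illustrative only. For lists, concatMap f xs and xs >>= f give the same result; only the argument order differs.

```haskell
import Data.Char (isLetter)

-- Illustrative helper: what a single character turns into.
-- [c | isLetter c] is [c] when c is a letter and [] otherwise.
step :: Char -> String
step c = if c == '-' then " " else [c | isLetter c]

-- Written with concatMap, as in the answer above.
cleanConcatMap :: String -> String
cleanConcatMap = concatMap step

-- The same thing written with the list monad's bind operator.
cleanBind :: String -> String
cleanBind s = s >>= step
```

Both definitions should turn "twenty-one!" into "twenty one".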

Upvotes: 7
