SavannahGemp
SavannahGemp

Reputation: 405

Partitioning a String into more pieces with separating char in Haskell

I have the following homework:

Define a function split :: Char -> String -> [String] that splits a string, which consists of substrings separated by a separator, into a list of strings.

Examples:

split '#' "foo##goo" = ["foo","","goo"]    
split '#' "#" = ["",""]

I have written the following function:

split :: Char -> String -> [String]
split c "" = [""]
split a "a" = ["",""]
split c st =  takeWhile (/=c) st : split c tail((dropWhile (/=c) st))

It does not compile, and I can't see why. TakeWhile adds all the characters which are not c to the result, then tail drops that c that was found already, and we recursively apply split to the rest of the string, gotten with dropWhile. The : should make a list of "lists" as strings are lists of chars in Haskell. Where is the gap in my thinking?

Update:

I have updated my program to the following:

my_tail :: [a]->[a]
my_tail [] = []
my_tail xs = tail xs

split :: Char -> String -> [String]
split c "" = [""]
split a "a" = ["",""]
split c st =  takeWhile (/=c) st ++ split c (my_tail(dropWhile (/=c) st))

I still get an error, the following: enter image description here

Why is the expected type [String] and then [Char]?

Upvotes: 3

Views: 909

Answers (1)

willeM_ Van Onsem
willeM_ Van Onsem

Reputation: 476557

The reason why this does not compile is because Haskell, sees your last clause as:

split c st = takeWhile (/=c) st : split c tail ((dropWhile (/=c) st))

It thus thinks that you apply three parameters to split: c, tail and ((dropWhile (/=c) st)). You should use brackets here, like:

split c st = takeWhile (/=c) st : split c (tail (dropWhile (/=c) st))

But that will not fully fix the problem. For example if we try to run your testcase, we see:

Prelude> split '#' "foo##goo"
["foo","","goo"*** Exception: Prelude.tail: empty list

tail :: [a] -> [a] is a "non-total" function. For the empty list, tail will error. Indeed:

Prelude> tail []
*** Exception: Prelude.tail: empty list

Eventually, the list will run out of characters, and then tail will raise an error. We might want to use span :: (a -> Bool) -> [a] -> ([a], [a]) here, and use pattern matching to determine if there is still some element that needs to be processed, like:

split :: Eq a => a -> [a] -> [[a]]
split _ [] = [[]]
split c txt = pf : rst
    where rst | (_:sf1) <- sf = split c sf1
              | otherwise = []
          (pf,sf) = span (c /=) txt

Here span (c /=) txt will thus split the non-empty list txt in two parts pf (prefix) is the longest prefix of items that are not equal to c. sf (suffix) are the remaining elements.

Regardless whether sf is empty or not, we emit the prefix pf. Then we inspect the suffix. We know that either sf is empty (we reached the end of the list), or that the the first element of sf is equal to c. We thus use pattern guard to check if this matches with the (_:sf1) pattern. This happens if sf is non-empty. In that case we bind sf1 with the tail of sf, and we recurse on the tail. In case sf1 is empty, we can stop, and thus return [].

For example:

Prelude> split '#' "foo##goo"
["foo","","goo"]
Prelude> split '#' "#"
["",""]

Upvotes: 4

Related Questions