Reputation: 405
I have the following homework:
Define a function split :: Char -> String -> [String] that splits a string, which consists of substrings separated by a separator, into a list of strings.
Examples:
split '#' "foo##goo" = ["foo","","goo"]
split '#' "#" = ["",""]
I have written the following function:
split :: Char -> String -> [String]
split c "" = [""]
split a "a" = ["",""]
split c st = takeWhile (/=c) st : split c tail((dropWhile (/=c) st))
It does not compile, and I can't see why. TakeWhile adds all the characters which are not c to the result, then tail drops that c that was found already, and we recursively apply split to the rest of the string, gotten with dropWhile. The : should make a list of "lists" as strings are lists of chars in Haskell. Where is the gap in my thinking?
Update:
I have updated my program to the following:
my_tail :: [a]->[a]
my_tail [] = []
my_tail xs = tail xs
split :: Char -> String -> [String]
split c "" = [""]
split a "a" = ["",""]
split c st = takeWhile (/=c) st ++ split c (my_tail(dropWhile (/=c) st))
I still get an error, the following:
Why is the expected type [String] and then [Char]?
Upvotes: 3
Views: 909
Reputation: 476557
The reason why this does not compile is because Haskell, sees your last clause as:
split c st = takeWhile (/=c) st : split c tail ((dropWhile (/=c) st))
It thus thinks that you apply three parameters to split
: c
, tail
and ((dropWhile (/=c) st))
. You should use brackets here, like:
split c st = takeWhile (/=c) st : split c (tail (dropWhile (/=c) st))
But that will not fully fix the problem. For example if we try to run your testcase, we see:
Prelude> split '#' "foo##goo"
["foo","","goo"*** Exception: Prelude.tail: empty list
tail :: [a] -> [a]
is a "non-total" function. For the empty list, tail
will error. Indeed:
Prelude> tail []
*** Exception: Prelude.tail: empty list
Eventually, the list will run out of characters, and then tail
will raise an error. We might want to use span :: (a -> Bool) -> [a] -> ([a], [a])
here, and use pattern matching to determine if there is still some element that needs to be processed, like:
split :: Eq a => a -> [a] -> [[a]]
split _ [] = [[]]
split c txt = pf : rst
where rst | (_:sf1) <- sf = split c sf1
| otherwise = []
(pf,sf) = span (c /=) txt
Here span (c /=) txt
will thus split the non-empty list txt
in two parts pf
(prefix) is the longest prefix of items that are not equal to c
. sf
(suffix) are the remaining elements.
Regardless whether sf
is empty or not, we emit the prefix pf
. Then we inspect the suffix. We know that either sf
is empty (we reached the end of the list), or that the the first element of sf
is equal to c
. We thus use pattern guard to check if this matches with the (_:sf1)
pattern. This happens if sf
is non-empty. In that case we bind sf1
with the tail of sf
, and we recurse on the tail. In case sf1
is empty, we can stop, and thus return []
.
For example:
Prelude> split '#' "foo##goo"
["foo","","goo"]
Prelude> split '#' "#"
["",""]
Upvotes: 4