Reputation: 11
I hope I was clear about my question!
Any help would be appreciated!
Upvotes: 0
Views: 243
Reputation: 47042
As you are learning, here's how to do it from scratch.
import qualified Data.Set as S
First, the set of word boundaries:
wordBoundaries :: S.Set Char
wordBoundaries = S.fromList " ."
(Data.Set.fromList takes a list of elements; [Char] is the same as String, which is why we can pass a string in this case.)
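As a small illustration of how such a set is then queried, here is a minimal sketch. The names boundaries and isBoundary are made up for this example and are not part of the answer's code; it assumes the containers package that ships with GHC.

```haskell
import qualified Data.Set as S

-- A String is a [Char], so it can be handed to S.fromList directly.
boundaries :: S.Set Char
boundaries = S.fromList " ."

-- Hypothetical helper: True when the character is one of the boundaries.
isBoundary :: Char -> Bool
isBoundary ch = S.member ch boundaries
```

With these definitions, isBoundary '.' is True and isBoundary 'a' is False.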
Next, splitting a string into words:
toWords :: String -> [String]
toWords = fst . foldr cons ([], True)
where
The documentation for fst and foldr is pretty clear, but that for . is a bit terse if you've not encountered function composition before. The argument given to toWords is fed to foldr cons ([], True). The . operator then takes the result from foldr cons ([], True) and feeds it to fst. Finally, the result from fst is used as the result of toWords itself.
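If composition is new to you, it may help to see the same shape written both ways. This is only a sketch: sumAndFlag, total and total' are hypothetical names chosen for illustration, and the fold here is deliberately simpler than the one used by toWords.

```haskell
-- Hypothetical fold that returns a pair, like foldr cons ([], True) does.
-- The Bool records whether the list had any elements.
sumAndFlag :: [Int] -> (Int, Bool)
sumAndFlag = foldr step (0, False)
  where
    step x (acc, _) = (x + acc, True)

-- Point-free: the argument flows into sumAndFlag, then its result into fst.
total :: [Int] -> Int
total = fst . sumAndFlag

-- The same function with the argument written out explicitly.
total' :: [Int] -> Int
total' xs = fst (sumAndFlag xs)
```

Both total and total' give 6 for [1, 2, 3]; the version with . just leaves the argument implicit.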
We still have to define cons, which goes under the where clause of toWords:
    cons :: Char -> ([String], Bool) -> ([String], Bool)
    cons ch (words, startNew)
      | S.member ch wordBoundaries = (         words, True)
      | startNew                   = ([ch] : words, False)
    cons ch (word : words, _)      = ((ch : word) : words, False)
Homework: work out what cons does and how it works. This may be easier if you first ensure you understand how foldr calls it.
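To see the order in which foldr calls its function, without giving the homework away, here is a small sketch using an unrelated function. The name showCalls is hypothetical; it only builds a string that mirrors the nesting of the calls.

```haskell
-- foldr f z "abc" expands to f 'a' (f 'b' (f 'c' z)),
-- so the last character is combined with the seed first.
showCalls :: String -> String
showCalls = foldr step "z"
  where
    -- Wrap the character and the result so far in parentheses.
    step ch acc = "(" ++ [ch] ++ " " ++ acc ++ ")"
```

Evaluating showCalls "abc" gives "(a (b (c z)))": the innermost call sees the last character and the initial value, and each earlier character wraps the result so far.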
Upvotes: 0
Reputation: 35089
You want Data.List.Split (from the split package), which covers the vast majority of splitting use cases.
For your example, just use:
splitOneOf ".,!?"
And if you want to get rid of the "empty words" between consecutive delimiters, just use:
filter (not . null) . splitOneOf ".,!?"
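If you want those delimiters to come from a set that you already stored them in, then just use:
import qualified Data.Set as S
import Data.List.Split (splitOneOf)
-- Example contents; substitute the set you already have.
s :: S.Set Char
s = S.fromList ".,!?"
split = filter (not . null) . splitOneOf (S.toList s)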
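If you can't install the split package, the following base-only sketch is intended to behave like filter (not . null) . splitOneOf delims. It is a stand-in written for illustration, not the library's implementation, and splitWords is a hypothetical name.

```haskell
-- Hypothetical stand-in: split on any delimiter and drop empty pieces.
splitWords :: [Char] -> String -> [String]
splitWords delims str =
  case break (`elem` delims) str of
    (chunk, [])       -> keep chunk []
    (chunk, _ : rest) -> keep chunk (splitWords delims rest)
  where
    -- Skip the empty chunks that appear between consecutive delimiters.
    keep chunk acc = if null chunk then acc else chunk : acc
```

Note that splitWords ".,!?" "yeah. what?" gives ["yeah"," what"], with the leading space kept, because a space is not among those delimiters. Passing " .,!?" instead gives ["yeah","what"].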
Upvotes: 1
Reputation: 3273
The function words from the Prelude splits a string on whitespace for you (a good way to find functions by their type is Hoogle).
Prelude> :t words
words :: String -> [String]
You just need to compose this with an appropriate filter that makes use of Set. Here's a really basic one:
import Data.Set (Set, fromList, notMember)
parser :: String -> [String]
parser = words . filter (`notMember` delims)
where delims = fromList ".,!?"
parser "yeah. what?"
Will return ["yeah", "what"].
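One thing to be aware of: because the delimiters are deleted before words runs, input such as "yeah.what?" comes back as a single word. If that matters, a possible variant is to turn delimiters into spaces instead of dropping them. This is only a sketch; parserSpaced is a hypothetical name, not part of the answer above.

```haskell
import Data.Set (Set, fromList, member)

delims :: Set Char
delims = fromList ".,!?"

-- Replace each delimiter with a space, then let words do the splitting.
parserSpaced :: String -> [String]
parserSpaced = words . map toSpace
  where
    toSpace ch = if ch `member` delims then ' ' else ch
```

Here parserSpaced "yeah.what?" gives ["yeah","what"], and parserSpaced "yeah. what?" gives the same result.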
Check out Learn You A Haskell for some good introductory material.
Upvotes: 2