Reputation: 95

How to implement "words" function

In the Haskell library there is a function called "words", it takes a string and returns a list, elements of which are the words in the string. Basically it removes the spaces and stores the words. For example:

words "Hello there mister person guy" == ["Hello", "there", "mister", "person", "guy"]

Now we have been asked to implement this ourselves. I've been trying for two days so far, and I really can't come up with a solution. I've tried looking it up online, but couldn't really find anything. So I am asking you for help.

Keep in mind though that I am very much a beginner with Haskell, and therefore not familiar with advanced concepts, so be detailed in your explanation if possible :).

Upvotes: 2

Answers (6)

stefan.schwetschke

Reputation: 8932

Imperative solution

You can use Haskell just like an imperative language, using if constructs and variable assignments. Instead of loops, you can use recursion. This leads to the following (naive) solution:

(WARNING: Haskell gurus, please just look away)

wordsRecursive input =
    wordsRecursive2 [] [] input
    where
        wordsRecursive2 words currentWord input =
            if (length input == 0)
            then
                currentWord : words
            else 
                let (i:is) = input in
                if (i == ' ') 
                then
                    wordsRecursive2 (currentWord : words) [] is
                else 
                    wordsRecursive2 words (i:currentWord) is

This code is butt ugly. There are many obvious problems with it, but the main problem is that it uses recursion. There is a saying that "recursion is the GOTO of functional programming". So when possible we should replace recursion by some higher level function.

Solution using folding

In this case, we can simply use folding.

wordsFold input = (\ (word, words) -> word:words) $  foldl wordsFold2 ([], []) input
    where
        wordsFold2 (currentWord, words) ' ' = ([], currentWord : words)
        wordsFold2 (currentWord, words) i   = (i : currentWord, words)

This function uses folding instead of recursion. Folding applies "wordsFold2" to each character in the input string, together with some intermediate result. The intermediate result is the return of each previous application of wordsFold2. In this case it consists of the (partial) current word and a list of the already found (complete) words. The initial intermediate result consists of an empty current word and an empty list of already found words.

In each step, the function wordsFold2 does on the following thins

It either adds the current word to the list of already found words
Or it adds the current character to the (incomplete) current word

Which option it uses depends on the value of the current character.

The result of the folding operation is the last intermediate result. The part (\ (word, words) -> word:words) then adds the last word to the list of words.

Exercise

Oh and btw, because you should do your own homework, I have left an exercise for you. My solution presents the results "backwards". You have to tune it a little bit to get the correct result :-P

Upvotes: -1

vek

Reputation: 1543

At the beginning, you could think in cases/situations, when you traverse a list:

import Data.Char (isSpace)

noWord (x:xs)
  | isSpace x = noWord xs
  -- is still space, do nothing and check next.
  | not (isSpace x) = isWord [x] xs
  -- is a new word - apply isWord. 

noWord [] = []
  -- if the fun ends with space.


isWord akku (x:xs) 
  | not (isSpace x) = isWord (akku ++ [x]) xs
  -- is still a word, so add x to the word and continue.
  | isSpace x = akku : noWord xs
  -- is no more a word, save the word and do some noWord's.

isWord akku [] = akku : []
  -- when list is empty, give the cached word back and end the list.


myWords = noWord
  -- start with no word.


main = print $ myWords " hell  oh world! "
-- try it

gives you:

["hell","oh","world!"]

Upvotes: 0

Pedro Rodrigues

Reputation: 1749

Here's the basic idea, shouldn't be too hard to put it on code:

Start by removing all leading space characters from the string
Accumulate characters from the beginning of the string until you find a space. This will be the first word.
Go back to step 1 using the remaining part of the string.

Upvotes: 0

Luis Casillas

Reputation: 30237

Well, one way to approach this (or really, most problems) is to break it into smaller problems whose solutions you can combine to solve yours, and then attack those problems separately. There is a standard function from Data.List that you can use to solve this problem:

span :: (a -> Bool) -> [a] -> ([a], [a])
span, applied to a predicate p and a list xs, returns a tuple where first element is longest prefix (possibly empty) of xs of elements that satisfy p and second element is the remainder of the list.

So one possible strategy is this:

Use span to break the list into two pieces: the largest initial prefix that doesn't contain a space, and the rest
Recursively apply this strategy to the rest of the string (ignoring any initial spaces).

If you're not allowed to use library functions in your class, then what I'd do is I'd write my own version of the library function in question.

Upvotes: 4

bheklilr

Reputation: 54078

Basically it removes the spaces and stores the words

I know what you intend with that statement, but it should be more accurately stated as

It splits the string on spaces and returns a list of the words

This gives us a very specific case to start with. What this problem boils down to is finding the spaces. The easy way would be by using takeWhile:

myWords :: String -> [String]
myWords "" = []
myWords text = takeWhile (/= ' ') text : ???

I won't give you the full solution, but this should help you get started. You need to figure out what goes in the place of ???. Also, this won't be the exact same as the words function since it also handles repeated spaces:

> words "This  is   a test"
["This","is","a","test"]

So you'll have to figure out how to do that as well.

Upvotes: 2

jamshidh

Reputation: 12070

Sounds like homework, so I don't want to just write the answer, but....

Some hints-

Write your type signatures first. Before you write a function, use Hoogle on the signatures to see if something useful exists.
You will want to use recursion (ie - once you find a word, you can split it from the rest of the string, and reapply your words function on the remainder.)
Remember strings are just lists in Haskell, so check out the Data.List functions and see what might be useful.

Upvotes: 2

How to implement &quot;words&quot; function

Answers (6)

Imperative solution

Solution using folding

Exercise

Related Questions

How to implement "words" function