Reputation: 47
I'm trying to create a function that loops through an array of Strings, adds the word to a new tuple which counts how many times a word occurs in a block of text. In an OO language, this is simple- create a KV pair for each word and the number of times it occurs. I'm trying to translate that code into Haskell, but I don't think its that straightforward.
countWords:: [String] -> [(String, Int)]
I know I need to create a list of tuples but I'm not sure how to loop through the list passed to the function using recursion.
Upvotes: 1
Views: 160
Reputation: 120751
A pretty direct translation of what you seem to be saying you'd do in OO would be to “loop” for each word through the list recursively and either update the entry that has it already, or append it as a new one:
registerWord :: String -> [(String, Int)] -> [(String, Int)]
registerWord w ((w',c):ws)
| w==w' = (w,c+1) : ws
| otherwise = (w',c) : registerWord w ws
registerWord w [] = [(w,1)]
Then do that for every given word, each time updating the register. This is easily done with a fold:
countWords :: [String] -> [(String, Int)]
countWords = foldr registerWord []
This list-inserting is awkward though, and inefficient (both in FP and OO), namely O(n2). A much nicer approach is to think functional-modularly: you effectively want to group equal words together. For this, you need to first sort them, so equal words are actually adjacent. Then you need to replace each group of duplicates with a single example, and the count. Nice functional pipeline:
countWords :: [String] -> [(String, Int)]
countWords = map (\gp@(w:_) -> (w, length gp)) . group . sort
Incidentally, there's nothing in this function that requires the keys to be “words” / strings, so you might as well generalise the signature to
countWords :: Ord a => [a] -> [(a, Int)]
(The other, inefficient approach would be even more general, requiring only Eq
.)
Upvotes: 5