Abgo80
Abgo80

Reputation: 253

Haskell How to count and group strings in a list

I am learning haskell and need some help in figuring out logic for this function. I only want to do the problem using functions in standard prelude and recursion if possible.

So I have a list of things eg:

["Abhi", "Stack","how", "Abhi"]

Goal is to convert this into list of pair with a count and list item output : [("Abhi", 2), ("Stack",1)..]

I came up with the following functions

countList :: [String] -> [(Int,String)]
countList [] = []
countList xss@(x:xs) = (x,checkCount x xss) : countList xs

checkCount :: String -> [String] -> Int
checkCount _ [] = 0
checkCount str (x:xs) | str == x = 1 + checkCount str xs
                      | otherwise = 0 + checkCount str xs

The output I get from this is :

[("Abhi",2), ("Stack",1), ("how",1),("Abhi",1)]

Notice how the last item is Abhi, 1. I can't find a recursive way to fix it.

Upvotes: 2

Views: 2379

Answers (6)

S4eed3sm
S4eed3sm

Reputation: 1478

I know you want to do this by Prelude, but as I searched for a Solutions with HashMap and I can't found a good solution, I want to share my solution with people that can use HashMap in their code:

I got help from this solution.

import qualified Data.HashMap.Strict as MP
l = ["a", "a", "v", "b", "b", "b"]
result = foldl (\mp x -> MP.insertWith (+) x 1 mp)  MP.empty l

the result is:

fromList [("b",3),("v",1),("a",2)]

and you can lookup a key by

MP.lookup "a" result

and its output will be

Just 2

Upvotes: 0

ely
ely

Reputation: 77424

Here is one way to do it with a series of helper functions. I assume that extra verbosity is OK since the solution needs to use only Prelude ... a concise solution would be a direct application of helper functions that already exist in either Data.Map or Data.List. Probably some of what I have can be made shorter using fold and so on...

countByGroup :: (Eq a) => [a] -> [(a, Int)]
countByGroup [] = []
countByGroup xs = let (c:cs) = toCounts xs
                  in mergeCounts [c] cs

toCounts :: [a] -> [(a, Int)]
toCounts xs = [(x, 1) | x <- xs]

mergeCounts :: (Eq a) => [(a, Int)] -> [(a, Int)] -> [(a, Int)]
mergeCounts = mergeCounts' []

mergeCounts' :: (Eq a) => [(a, Int)] -> [(a, Int)] -> [(a, Int)] -> [(a, Int)]
mergeCounts' acc [] [] = acc
mergeCounts' acc (l:ls) [] = case ls of
    [] -> acc ++ [l]
    otherwise -> acc ++ mergeCounts [l] ls
mergeCounts' acc [] (r:rs) = case rs of
    [] -> acc ++ [r]
    otherwise -> acc ++ mergeCounts [r] rs
mergeCounts' acc (l:ls) rs = 
    let rSum   = sum [snd r | r <- rs, fst r == fst l]
        lSum   = sum [snd x | x <- ls, fst x == fst l]
        newAcc = acc ++ [(fst l, snd l + rSum + lSum)]
        newRs  = [r | r <- rs, fst r /= fst l]
        newLs  = [x | x <- ls, fst x /= fst l]
    in  mergeCounts' newAcc newLs newRs

For example:

*Main> countByGroup ["a", "b", "a", "b"]
[("a",2),("b",2)]
*Main> countByGroup ["Abhi", "Stack","how", "Abhi"]
[("Abhi",2),("Stack",1),("how",1)]
*Main> countByGroup ["Abhi", "Stack","how", "Abhi", "how"]
[("Abhi",2),("Stack",1),("how",2)]
*Main> countByGroup ["Abhi", "Abhi"]
[("Abhi",2)]
*Main> countByGroup []
[]
*Main> countByGroup ["how", "Abhi", "Abhi", "how", "how"]
[("how",3),("Abhi",2)]

Upvotes: 0

גלעד ברקן
גלעד ברקן

Reputation: 23955

One problem in your code is the snippet, countList xs - it means you are not counting occurrences of each x in the full list but rather one that diminishes with each call of countList.

Why not add a parameter to your original countList of the form [(Int,String)]- let's call it, result.

As you traverse the list to count, pass x and result to the checkCount function, which in this case will traverse result. If it finds a match for x, it will increase the count for that x in result. If it does not find x, it will insert (1,x) in result.

Upvotes: 0

David Lilue
David Lilue

Reputation: 631

What happen on your function countList (checkCount is fine) is on this line countList xss@(x:xs) = (x,checkCount x xss) : countList xs. When you call recursivly to countList you lost the head x. So, the first recursive call will be like:

("Abni",2) : countList ["Stack","how", "Abhi"]

When it get to the next "Abni", there will be just one.

You need to keep the original list. Try this:

countList' :: [String] -> [(String, Int)]
countList' [] = []
countList' (s:ss) = fun s ss (s:ss) []
    where fun s ss sss acc
        | null ss   = (s, checkCount' s sss):acc
        | otherwise = fun (head ss) (tail ss) sss $ (s, checkCount' s sss):acc 

The above is with tail recursion. More simple:

countList :: [String] -> [(String, Int)]
countList [] = []
countList ss = hack ss ss
    where hack (x:xs) sss
        | null xs   = [(x, checkCount x sss)]
        | otherwise = (x, checkCount x sss) : (hack xs sss)

Upvotes: 0

Abgo80
Abgo80

Reputation: 253

bheklilr - Thanks for your awesome answer and explanation. Since I am a newb in Haskell it was little over my head in the beginning.

Anyway, I found another way to do it. I changed my approach to list recursion with index. It may not be optimal but at least works. Here it goes

Initially I was iterating using list values, so, I couldn't account for values that were removed. I decided to keep the list as is and iterate using indices.

countList' is a wrapper, so, I don't have to pass list with its length.

countList' :: [String] -> [(Int,String)]
countList' [] = []
countList' lst = countListWithIndex (length' lst) lst

CheckCount function was unchanged

checkCount :: String -> [String] -> Int
checkCount _ [] = 0
checkCount str (x:xs) | str == x = 1 + checkCount str xs
                      | otherwise = 0 + checkCount str xs

This function gets the number of repetitions in the list based on the index.

countListWithIndex :: Int -> [String] -> [(Int, String)]
countListWithIndex 0 _ = []
countListWithIndex n str = (checkCount string str, string) : countListWithIndex (n-1) str
                             where string = dataAt (n-1) str

Function to find data in a list using index

dataAt :: Int -> [a] -> a
dataAt _ [] = error "Empty List!"
dataAt y (x:xs)  | y <= 0 = x
                 | otherwise = dataAt (y-1) xs

Simple function to calculate length of a list

length' :: [a] -> Int
length' [] = 0
length' (_:xs) = 1 + length' xs

Also, I know the list still has duplicate values. I may have to write another function to remove them. Baby steps :)

Anyway would love to know feedback.

Upvotes: 0

bheklilr
bheklilr

Reputation: 54058

To do this with just Prelude functions and recursion, you can emulate a Map structure using a list of tuples. This will be less efficient, but it avoids having to use other modules. The core of the algorithm is just writing a function that updates a value in a Map:

type Map k v = [(k, v)]  -- Simulate Data.Map.Map

update :: Eq k => k -> (Maybe v -> v) -> Map k v -> Map k v
update k fv [] = [(k, fv Nothing)]
update k fv ((k', v'):rest)
    = if k == k'
        then (k', fv $ Just v') :             rest
        else (k',           v') : update k fv rest

This function just says to walk down the Map, if it finds the element then update it and stop, otherwise keep trying. If it hits the end of the Map then call the updating function with Nothing to produce a default value.

Next, we can use this to build up a Map k Int for this specific case:

countOccurs :: Eq k => [k] -> Map k Int
countOccurs = foldr go []
    where
        go k = update k updtr
        updtr Nothing  = 1
        updtr (Just i) = i + 1

But we can inline updtr to be just maybe 1 (+1):

countOccurs = foldr go []
    where
        go k = update k (maybe 1 (+1))

And since go is simple enough, let's just inline it using flip, or equivalently using backticks to make update infix:

countOccurs = foldr (`update` maybe 1 (+1)) []

We don't even need to put parens around maybe 1 (+1) since it is now seen as an argument to an operator.


As a demo:

> countOccurs $ words "Abhi Stack how Abhi"
[("Abhi", 2), ("how", 1), ("Stack", 1)]

> countOccurs [1, 1, 2, 3, 2, 2, 1, 1]
[(1, 4), (2, 3), (3, 1)]

Upvotes: 1

Related Questions