Raphm
Raphm

Reputation: 105

Working with list of tuples

I've been trying to solve this, but I just can't figure it out. So, I've a list with tuples, for example:

[("Mary", 10), ("John", 45), ("Bradley", 30), ("Mary", 15), ("John", 10)]

and what I want to get is a list with also tuples where, if the name is the same, the numbers of those tuples should be added and, if not, that tuple must be part of the final list too, exemplifying:

[("Mary",25), ("John", 55), ("Bradley", 30)]

I don't know if I explained myself really well, but I think you'll probably understand with the examples.

I've tried this, but it doesn't work:

test ((a,b):[]) = [(a,b)]
test ((a,b):(c,d):xs) | a == c = (a,b+d):test((a,b):xs)
                      | otherwise = (c,d):test((a,b):xs)

Upvotes: 2

Views: 7508

Answers (4)

Alfonso Villén
Alfonso Villén

Reputation: 373

Here are my two cents. Using just the Haskell Prelude.

test tup = sumAll
  where
    collect ys [] = ys
    collect ys (x:xs) =
        if (fst x) `notElem` ys
        then collect (fst x : ys) xs
        else collect ys xs
    collectAllNames = collect [] tup

    sumOne [] n x = (x, n)
    sumOne (y:ys) n x =
        if fst y == x
        then sumOne ys (n + snd y) x
        else sumOne ys n x

    sumAll = map (sumOne tup 0) collectAllNames

This method traverses the original list several times. Collect builds a temporary list holding just the names, skipping name repetitions. sumOne takes a name, checks what names in the list matches, and adds their numbers. It returns the name as well as the sum.

Upvotes: 0

user1891025
user1891025

Reputation: 131

Here's another way using lists:

import Data.List

answer :: [(String, Int)] -> [(String, Int)]
answer = map (foo . unzip) . groupBy (\x y -> fst x == fst y) . sort            
   where foo (names, vals) = (head names, sum vals)

It's a fairly straightforward approach. First, the dot (.) represents function composition which allows us to pass values from one function to the next, that is, the output of one becomes the input of the next, and so on. We start by applying sort which will automatically move the names next to one another in the list. Next we use groupBy to put each pair with similar names into a single list. We end up with a list of lists, each containing pairs with similar names:

[[("Bradley",30)], [("John",10),("John",45)], [("Mary",10),("Mary", 15)]]

Given such a list, how would you handle each sublist? That is, how would you handle a list containing all the same names?

Obviously we wish to shrink them down into a single pair, which contains the name and the sum of the values. To accomplish this, I chose the function (foo . unzip), but there are many other ways to go about it. unzip takes a list of pairs and creates a single pair. The pair contains 2 lists, the first with all the names, the second with all the values. This pair is then passed to foo by way of function composition, as discussed earlier. foo picks it apart using a pattern, and then applies head to the names, returning only a single name (they're all the same), and applying sum to the list of values. sum is another standard list function that sums the values in a list, naturally.

However, this (foo . unzip) only applies to a single list of pairs, yet we have a list of lists. This is where map comes in. map will apply our (foo . unzip) function to each list in the list, or more generally, each element in the list. We end up with a list containing the results of applying (foo . unzip) to each sublist.

I would recommend looking at all the list functions used in Data.List.

Upvotes: 4

C. A. McCann
C. A. McCann

Reputation: 77384

Doing this sort of thing is always awkward with lists, because of their sequential nature--they don't really lend themselves to operations like "find matching items" or "compute a new list by combining specific combinations of list elements" or other things that are by nature non-sequential.

If you step back for a moment, what you really want to do here is, for each distinct String in the list, find all the numbers associated to it and add them up. This sounds more suited to a key-value style data structure, for which the most standard in Haskell is found in Data.Map, which gives you a key-value map for any value type and any ordered key type (that is, an instance of Ord).

So, to build a Map from your list, you can use the fromList function in Data.Map... which, conveniently, expects input in the form of a list of key-value tuples. So you could do this...

import qualified Data.Map as M

nameMap = M.fromList [("Mary", 10), ("John", 45), ("Bradley", 30), ("Mary", 15), ("John", 10)]

...but that's no good, because inserting them directly will overwrite the numbers instead of adding them. You can use M.fromListWith to specify how to combine values when inserting a duplicate key--in the general case, it's common to use this to build a list of values for each key, or similar things.

But in your case we can skip straight to the desired result:

nameMap = M.fromListWith (+) [("Mary", 10), ("John", 45), ("Bradley", 30), ("Mary", 15), ("John", 10)]

This will insert directly if it finds a new name, otherwise it will add the values (the numbers) on a duplicate. You can turn it back into a list of tuples if you like, using M.toList:

namesList = M.toList $ M.fromListWith (+) [("Mary", 10), ("John", 45), ("Bradley", 30), ("Mary", 15), ("John", 10)]

Which gives us a final result of [("Bradley",30),("John",55),("Mary",25)].

But if you want to do more stuff with the collection of names/numbers, it might make more sense to keep it as a Map until you're done.

Upvotes: 8

user1020786
user1020786

Reputation:

I think the reason your potential solution did not work, is that it will only group elements together if they occur sequentially with the same key in the list. So instead, I'm going to use a map (often called a dictionary if you've used other languages) to remember which keys we've seen and keep the totals. First we need to import the functions we need.

import Data.Map hiding (foldl, foldl', foldr)
import Data.List (foldl')

Now we can just fold along the list, and for each key value pair update our map accordingly.

sumGroups :: (Ord k, Num n) => [(k, n)] -> Map k n
sumGroups list = foldl' (\m (k, n) -> alter (Just . maybe n (+ n)) k m) empty list

So, foldl' walks along the list with a function. It calls the function with each element (here the pair (k, n)), and another argument, the accumulator. This is our map, which starts out as empty. For each element, we alter the map, using a function from Maybe n -> Maybe n. This reflects the fact the map may not already have anything in it under the key k - so we deal with both cases. If there's no previous value, we just return n, otherwise we add n to the previous value. This gives us a map at the end which should contain the sums of the groups. Calling the toList function on the result should give you the list you want.

Testing this in ghci gives:

 $ ghci
GHCi, version 7.6.1: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> import Data.Map hiding (foldl, foldl', foldr)
Prelude Data.Map> import Data.List (foldl')
Prelude Data.Map Data.List> let sumGroups list = foldl' (\m (k, n) -> alter (Just . maybe n (+ n)) k m) empty list
Loading package array-0.4.0.1 ... linking ... done.
Loading package deepseq-1.3.0.1 ... linking ... done.
Loading package containers-0.5.0.0 ... linking ... done.
Prelude Data.Map Data.List> toList $ sumGroups $ [("Mary", 10), ("John", 45), ("Bradley", 30), ("Mary", 15), ("John", 10)]
[("Bradley",30),("John",55),("Mary",25)]
Prelude Data.Map Data.List> 

The groups come out in sorted order as a bonus, because internally map uses a form of binary tree, and so it's relatively trivial to traverse in order and output a sorted (well, sorted by key anyway) list.

Upvotes: 1

Related Questions