Sean Clark Hess
Sean Clark Hess

Reputation: 16059

Combine equivalent items in a list

Let's say I have the following type

type Key = String
type Score = Int
data Thing = Thing Key Score

And if I have an array of them like this:

[Thing "a" 7, Thing "b" 5, Thing "a" 10]

Is there a standard way to reduce this so that I don't have any duplicate keys? If two keys match, I want to take the better score

[Thing "b" 5, Thing "a" 10]

Upvotes: 4

Views: 256

Answers (4)

pmr
pmr

Reputation: 59811

Here is my feeble attempt. There surely is a nicer way but I'm not much of a Haskell programmer.

import Data.List

type Key = String
type Score = Int
data Thing = Thing Key Score
           deriving (Show, Ord)

instance Eq Thing where
    (Thing k1 _) == (Thing k2 _) = k1 == k2
    (Thing k1 _) /= (Thing k2 _) = k1 /= k2

thingSort :: [Thing] -> [Thing]
thingSort = Data.List.sortBy (flip compare)

ex = [Thing "a" 7, Thing "b" 5, Thing "a" 10]

filtered = nub (thingSort ex)

Upvotes: 0

hugomg
hugomg

Reputation: 69934

I'm posting a O(n log n) solution, since everyone seems fine with being O(n^2)

consolidate :: (Ord a, Ord b) => [Thing a b] -> [Thing a b]
consolidate xs = 
    max_from_each_group (sortBy (compare `on` getKey) xs)
    where
       max_from_each_group [] = []
       max_from_each_group (x:xs) = 
           let (same_key, rest) = span (\t -> x == getKey t) xs in
           let group_max = maximumBy (compare `on` getValue) (x:same_key) in
           group_max : max_from_each_group rest

Upvotes: 1

hammar
hammar

Reputation: 139850

A very simple solution is to use Data.Map.fromListWith, which converts a list of key-value pairs to a map, given a function to combine multiple values with the same key.

Prelude Data.Map> fromListWith max [("a", 7), ("b", 5), ("a", 10)]
fromList [("a",10),("b",5)]

Note that this expects tuples, so convert as necessary. Also, it does not preserve the order of the input elements. Run time is O(n log n).

Upvotes: 6

Tarrasch
Tarrasch

Reputation: 10547

Basically first we must decide what is problem solving and what is implementation difficulties. So what If we first sort by Score, and then just keep the first occurrences in the sorted list with respect to Key? That should work, let's look at the haskell implementation:

import Data.List
import Data.Function

type Key = String
type Score = Int
data Thing = Thing { key :: Key, score :: Score }
  deriving (Show)

myNub  = nubBy  ((==) `on` key)
mySort = sortBy (compare `on` (negate . score))

selectFinest = myNub . mySort

Now we try run this in ghci:

Prelude> :load Test.hs 
[1 of 1] Compiling Main             ( Test.hs, interpreted )
Ok, modules loaded: Main.
*Main> selectFinest [Thing "a" 7, Thing "b" 5, Thing "a" 10]
[Thing {key = "a", score = 10},Thing {key = "b", score = 5}]

Checkout hoogle if you are uncertain about the functions I used in the solution. It indeed takes some time to learn how to use on and those functions.

Upvotes: 5

Related Questions