Reputation: 8609
group :: Ord a => [(a, [b])] -> [(a, [b])]
I want to look up all pairs that have the same fst, and merge them, by appending all the list of bs together where they have the same a and discarding the unnessecary pair and so on...
I got as far as:
group ((s, ls):(s', ls'):ps) =
if s == s'
then group ((s, ls++ls'):ps)
else (s, ls) : group ((s', ls'):ps)
group p = p
but obviously this ain't going to cut it, because it doesn't group everything.
Edit: example
[("a", as),("c", cs), ("c", cs3), ("b", bs),("c", cs2), ("b", bs2)]
would output
[("a", as),("c", cs++cs2++cs3),("b", bs++bs2)]
Upvotes: 3
Views: 629
Reputation: 61489
Two alternative solutions to barkmadley's answer:
As Tirpen notes in a comment, the best way to attack this problem depends on the number m of distinct first elements in the tuples of the input list. For small values of m barkmadley's use of Data.List.partition
is the way to go. For large values however, the algorithm's complexity of O(n * m) is not so nice. In that case an O(n log n) sort of the input may turn out to be faster. Thus,
import Data.List (groupBy, sortBy)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = map mergeGroup . myGroup . mySort
where
mySort = sortBy (\a b -> compare (fst a) (fst b))
myGroup = groupBy (\a b -> fst a == fst b)
mergeGroup ((a, b):xs) = (a, b ++ concatMap snd xs)
This yields [("Dup",["2","3","1","5"]),("Non",["4"])]
on barkmadley's input.
Alternatively, we can call in the help of Data.Map
:
import Data.Map (assocs, fromListWith)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = assocs . fromListWith (++)
This will yield [("Dup",["5","1","2","3"]),("Non",["4"])]
, which may or may not be an issue. If it is, then there are again two solutions:
Reverse the input first using Data.List.reverse
:
import Data.List (reverse)
import Data.Map (assocs, fromListWith)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = assocs . fromListWith (++) . reverse
Prepend (flip (++)
) instead of append ((++)
) (Thanks to barkmadley; I like this solution better):
import Data.Map (assocs, fromListWith)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = assocs . fromListWith (flip (++))
Both of these definitions will cause combine
to output [("Dup",["2","3","1","5"]),("Non",["4"])]
.
As a last remark, note that all these definitions of combine
require the first element of the tuples in the input list to be instances of class Ord
. barkmadley's implementation only requires these elements to be instances of Eq
. Thus there exist inputs which can be handled by his code, but not by mine.
Upvotes: 11
Reputation: 7074
Another solution, using a fold to accumulate the groups in a Map. Because of the Map this does require that a
is an instance of Ord
(BTW your original definition requires that a
is an instance of Eq
, which barkmadley has incorporated in his solution).
import qualified Data.Map as M
group :: Ord a => [(a, [b])] -> [(a, [b])]
group = M.toList . foldr insert M.empty
where
insert (s, l) m = M.insertWith (++) s l m
If you're a big fan of obscurity, replace the last line with:
insert = uncurry $ M.insertWith (++)
This omits the unnecessary m
and uncurry
breaks the (s, l)
pair out into two arguments s
and l
.
Upvotes: 0
Reputation: 5297
import Data.List hiding (group)
group :: (Eq a) => [(a, [b])] -> [(a, [b])]
group ((s,l):rest) = (s, l ++ concatMap snd matches) : group nonmatches
where
(matches, nonmatches) = partition (\x-> fst x == s) rest
group x = x
this function produces the result:
group [("Dup", ["2", "3"]), ("Dup", ["1"]), ("Non", ["4"]), ("Dup", ["5"])]
= [("Dup", ["2", "3", "1", "5"]), ("Non", ["4"])]
it works by filtering the remaining bits into two camps, the bits that match and the bits that dont. it then combines the ones that match and recurses on the ones that don't. This effectly means you will have one tuple in the output list per 'key' in the input list.
Upvotes: 6