Haskell: Removing duplicate tuples from a list?

Question

I'm trying to get from the before to after state. Is there a convenient Haskell function for removing duplicate tuples from a list? Or perhaps it is something a bit more complicated such as iterating through the entire list?

Before: the list of tuples, sorted by word, as in
   [(2,"a"), (1,"a"), (1,"b"), (1,"b"), (1,"c"), (2,"dd")]
After: the list of sorted tuples with exact duplicates removed, as in
   [(2,"a"), (1,"a"), (1,"b"), (1,"c"), (2,"dd")]

behzad.nouri · Accepted Answer

Searching Hoogle for the type Eq a => [a] -> [a] turns up the nub function from Data.List. Its documentation reads:

The nub function removes duplicate elements from a list. In particular, it keeps only the first occurrence of each element. (The name nub means `essence'.)

As the documentation notes, the more general case is nubBy, which takes a user-supplied equality predicate instead of using (==).
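
As a minimal sketch, here is nub applied to the list from the question, followed by a nubBy call. The names before, after and byWord are only illustrative, and the "same word" predicate is an assumed variation rather than something the question asks for:

```haskell
import Data.Function (on)
import Data.List (nub, nubBy)

-- The list from the question.
before :: [(Int, String)]
before = [(2,"a"), (1,"a"), (1,"b"), (1,"b"), (1,"c"), (2,"dd")]

-- Exact duplicates removed; the first occurrence of each tuple is kept.
after :: [(Int, String)]
after = nub before
-- [(2,"a"),(1,"a"),(1,"b"),(1,"c"),(2,"dd")]

-- Assumed variation: treat tuples that share a word as duplicates.
byWord :: [(Int, String)]
byWord = nubBy ((==) `on` snd) before
-- [(2,"a"),(1,"b"),(1,"c"),(2,"dd")]
```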

That said, nub is an O(n^2) algorithm and can be slow on long lists. If the values are an instance of the Ord type class, an alternative is to go through Data.Set.fromList, which takes O(n log n), as in:

import qualified Data.Set as Set

nub' :: Ord a => [a] -> [a]
nub' = Set.toList . Set.fromList

though this will not maintain the order of the original list: the result comes back in ascending order.
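
To make the reordering concrete, this sketch runs the Set round trip on the list from the question. Tuples compare on the number first and the word second, so the result is no longer sorted by word. The names before and reordered are illustrative only:

```haskell
import qualified Data.Set as Set

-- Same definition as in the answer.
nub' :: Ord a => [a] -> [a]
nub' = Set.toList . Set.fromList

-- The list from the question.
before :: [(Int, String)]
before = [(2,"a"), (1,"a"), (1,"b"), (1,"b"), (1,"c"), (2,"dd")]

-- Duplicates are gone, but the tuples now come out in ascending order.
reordered :: [(Int, String)]
reordered = nub' before
-- [(1,"a"),(1,"b"),(1,"c"),(2,"a"),(2,"dd")]
```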

A simple set-based solution that maintains the order of the original list is:

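import Data.Set (Set, member, insert, empty)

nub' :: Ord a => [a] -> [a]
-- The fold builds the result back to front, hence the final reverse.
nub' = reverse . fst . foldl loop ([], empty)
    where
    -- The accumulator pairs the elements kept so far with the set of values seen.
    loop :: Ord a => ([a], Set a) -> a -> ([a], Set a)
    loop acc@(xs, obs) x
        | x `member` obs = acc                    -- already seen: skip it
        | otherwise = (x:xs, x `insert` obs)      -- first occurrence: keep and record it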
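
Because the version above uses foldl followed by reverse, it has to consume the whole list before it returns anything. A possible variation, which is not part of the answer above, threads the set through explicit recursion so that results are produced lazily. The name ordNub is hypothetical:

```haskell
import qualified Data.Set as Set

-- Hypothetical name: an order-preserving, lazy variant of the same idea.
ordNub :: Ord a => [a] -> [a]
ordNub = go Set.empty
  where
    go _ [] = []
    go seen (x:xs)
      | x `Set.member` seen = go seen xs                     -- duplicate: drop it
      | otherwise           = x : go (Set.insert x seen) xs  -- first occurrence: emit it

-- The list from the question.
before :: [(Int, String)]
before = [(2,"a"), (1,"a"), (1,"b"), (1,"b"), (1,"c"), (2,"dd")]

-- ordNub before
-- [(2,"a"),(1,"a"),(1,"b"),(1,"c"),(2,"dd")]
```

Since each element is emitted as soon as it is first seen, this variant can also be used on the front of an infinite list, for example take 3 (ordNub (cycle [1,2,3])).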
