BufBills
BufBills

Reputation: 8113

Why does "nub" have O(n^2) complexity when it could be O(n)?

The nub function from Data.list has a O(n2) complexity. It's clear that implementing an O(n) algorithm is doable and not hard. Why doesn't Haskell do it?

nub :: Eq a => [a] -> [a]
base Data.List
O(n^2). The nub function removes duplicate elements from a list. In particular,
it keeps only the first occurrence of each element. (The name nub means
`essence'.) It is a special case of nubBy, which allows the programmer to supply
their own equality test. 

Upvotes: 0

Views: 2040

Answers (1)

Thomas M. DuBuisson
Thomas M. DuBuisson

Reputation: 64740

To provide the answer that has been made rather painfully explicit in the comments: Your premise is wrong, nub :: Eq a => [a] -> [a] can not have an O(n) implementation.

You are probably thinking of an implementation that can assume ordering, ordNub :: Ord a => [a] -> [a], this is usually linear log. That or perhaps you are assuming a hashable, bucket-sortable sort of thing. Unlike with Eq, when you have ordering information you don't need to potentially compare every pair of elements.

This topic was discussed in 2008: http://haskell.1045720.n5.nabble.com/GHC-2717-Add-nubWith-nubOrd-td3159919.html

This topic was discussed again, with almost completely different actors in the community, in 2013: https://www.haskell.org/pipermail/haskell-cafe/2013-October/110984.html

Upvotes: 35

Related Questions