Reputation: 27835
I have something like this:
[[1,3,2],[5,6],[3,8],[10,11],[13,6]]
And I want to filter out the third and last list, so that it becomes:
[[1,3,2],[5,6],[10,11]]
The logic is to remove the lists if any of the numbers "appeared before" (in the case of [3,8]
, the number 3
appears in the first list).
I can only think of non-FP methods to do this. My goal is to learn Haskell in a "good way". How can this be done?
Upvotes: 1
Views: 72
Reputation: 476709
What about this: you use an accumulator like a Data.HashSet and then use some kind of filter approach like:
import Data.HashSet
import Data.Hashable
somefilter :: (Hashable a, Ord a) => [[a]] -> [[a]]
somefilter = sf empty
sf :: (Hashable a, Ord a) => Set a -> [[a]] -> [[a]]
sf hs (x:xs) | any (flip member hs) x = tl
| otherwise = x : tl
where tl = sf (foldr insert hs x) xs
sf _ [] = []
Then the demo:
*Main> somefilter [[1,3,2],[5,6],[3,8],[10,11],[13,6]]
[[1,3,2],[5,6],[10,11]]
You can of course use another datastructure (list, treeset, whatever) to store the values you've encountered thus far. A HashSet
is simply a reasonable choice because of the low lookup and insertion time complexity.
Edit
Evidently time complexity is not easy to measure for lazy programming. If you intend to generate the entire list, the time complexities can be derived from the documentation of Data.IntMap
.
Given that the total number of elements is N (not to be confused with the number of lists), and we consider the number of bits of the Int
W fixed (32 or 64), both insert
and lookup
are done in constant time. This means that the total time complexity is O(N).
For the nubBy
solution, although the time complexity for intersect
this probably runs in O(m n) with m the number of items in one list, and n the number of elements in another list. This is done for each list against all previous. So in case there are O(k) lists, the time complexity is O(k2 a2) with a the average number of items in a list. Since a k=N, the time complexity is thus approximately O(N2).
Edit 2
Based on your comment, you can rewrite the function in order to meet the new specifications:
import Data.HashSet
import Data.Hashable
somefilter2 :: (Hashable a, Ord a) => [[a]] -> [[a]]
somefilter2 = sf2 empty
sf2 :: (Hashable a, Ord a) => Set a -> [[a]] -> [[a]]
sf2 hs (x:xs) | any (flip member hs) x = sf2 hs xs
| otherwise = x : sf2 (foldr insert hs x) xs
sf2 _ [] = []
With the difference between somefilter
and somefilter2
:
*Main> somefilter [[1,3,2],[5,6],[3,8],[10,11],[8],[13,6]]
[[1,3,2],[5,6],[10,11]]
*Main> somefilter2 [[1,3,2],[5,6],[3,8],[10,11],[8],[13,6]]
[[1,3,2],[5,6],[10,11],[8]]
there is only a difference in the recursive part of sf
: now you only add elements to the Set
in case you "emit" an element.
Upvotes: 5