Reputation: 23078
Given a record like
data Foo = Foo { fooName :: Text, fooAge :: Int, fooCity :: Text }
Given a list of such elements, is there a function that removes duplicates based on a subset of fields only, along the lines of this hypothetical removeDupBy function?
foos =
[
Foo "john" 32 "London",
Foo "joe" 18 "New York",
Foo "john" 22 "Paris",
Foo "john" 32 "Madrid",
Foo "joe" 17 "Los Angeles",
Foo "joe" 18 "Berlin"
]
> removeDupBy (\f -> (fooName f, fooAge f)) foos
[
Foo "john" 32 "London",
Foo "joe" 18 "New York",
Foo "john" 22 "Paris",
Foo "joe" 17 "Los Angeles"
]
I could implement my own, but I would prefer one from a well-established library, which will probably be faster and more robust against edge cases. I was thinking of using nub, but I'm not sure how to map the actual Foo elements to the (fooName, fooAge) tuples that nub would filter on.
Upvotes: 2
Views: 109
Reputation: 33496
Since you are dealing with only strings and numbers, you can use the Ord instance to remove duplicates efficiently, or even Hashable, which allows practically constant-time lookups.
Some functions which exactly match your desired signature are:
nubOrdOn from the containers package:
Data.Containers.ListUtils> nubOrdOn (\f -> (fooName f, fooAge f)) foos
hashNubOn from the witherable package:
Witherable> hashNubOn (\f -> (fooName f, fooAge f)) foos
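For reference, here is a minimal, self-contained sketch of the nubOrdOn option applied to the data from the question. The helper name dedupByNameAge is hypothetical, and the deriving clause is an assumption added so the results can be printed.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Containers.ListUtils (nubOrdOn)
import Data.Text (Text)

-- Record from the question; the deriving clause is an assumption for printing.
data Foo = Foo { fooName :: Text, fooAge :: Int, fooCity :: Text }
  deriving (Show, Eq)

foos :: [Foo]
foos =
  [ Foo "john" 32 "London"
  , Foo "joe" 18 "New York"
  , Foo "john" 22 "Paris"
  , Foo "john" 32 "Madrid"
  , Foo "joe" 17 "Los Angeles"
  , Foo "joe" 18 "Berlin"
  ]

-- Hypothetical helper: keeps the first element seen for each (name, age) pair
-- and preserves the original order of the survivors.
dedupByNameAge :: [Foo] -> [Foo]
dedupByNameAge = nubOrdOn (\f -> (fooName f, fooAge f))

main :: IO ()
main = mapM_ print (dedupByNameAge foos)
```

Only the projected key needs an Ord instance here, so Foo itself does not have to be orderable.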
You may find other options by searching on Hoogle for (a -> b) -> [a] -> [a]
If you need to do many operations like this, you may prefer to use Map or HashMap directly.
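One way the Map suggestion might look, as a sketch only: index the records by the (fooName, fooAge) key once, then reuse the index for lookups. The name indexByNameAge is hypothetical, and note that a Map orders entries by key, so the original list order is not preserved.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Map.Strict as Map
import Data.Text (Text)

-- Record from the question; the deriving clause is an assumption for printing.
data Foo = Foo { fooName :: Text, fooAge :: Int, fooCity :: Text }
  deriving (Show, Eq)

foos :: [Foo]
foos =
  [ Foo "john" 32 "London"
  , Foo "joe" 18 "New York"
  , Foo "john" 22 "Paris"
  , Foo "john" 32 "Madrid"
  , Foo "joe" 17 "Los Angeles"
  , Foo "joe" 18 "Berlin"
  ]

-- Hypothetical helper: one entry per (name, age) key.
-- fromListWith calls the combining function as (new value, old value),
-- so returning the old value keeps the first occurrence in the list.
indexByNameAge :: [Foo] -> Map.Map (Text, Int) Foo
indexByNameAge =
  Map.fromListWith (\_new old -> old)
    . map (\f -> ((fooName f, fooAge f), f))

main :: IO ()
main = do
  let index = indexByNameAge foos
  -- The deduplicated records, in key order rather than list order.
  mapM_ print (Map.elems index)
  -- Repeated lookups reuse the same index.
  print (fooCity <$> Map.lookup ("john", 32) index)
```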
Upvotes: 2
Reputation: 233247
You can use nubBy:
Prelude Data.List> nubBy (\x y -> (fooName x, fooAge x) == (fooName y, fooAge y)) foos
[Foo {fooName = "john", fooAge = 32, fooCity = "London"},
Foo {fooName = "joe", fooAge = 18, fooCity = "New York"},
Foo {fooName = "john", fooAge = 22, fooCity = "Paris"},
Foo {fooName = "joe", fooAge = 17, fooCity = "Los Angeles"}]
(Output formatted for enhanced readability)
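The same comparison can be written a little more compactly with on from Data.Function. This is a sketch; the names key and dedupByNameAge are hypothetical. Keep in mind that nubBy needs only an equality test but compares elements pairwise, so it takes quadratic time on long lists.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Function (on)
import Data.List (nubBy)
import Data.Text (Text)

-- Record from the question; the deriving clause is an assumption for printing.
data Foo = Foo { fooName :: Text, fooAge :: Int, fooCity :: Text }
  deriving (Show, Eq)

foos :: [Foo]
foos =
  [ Foo "john" 32 "London"
  , Foo "joe" 18 "New York"
  , Foo "john" 22 "Paris"
  , Foo "john" 32 "Madrid"
  , Foo "joe" 17 "Los Angeles"
  , Foo "joe" 18 "Berlin"
  ]

-- Hypothetical projection onto the fields that define a duplicate.
key :: Foo -> (Text, Int)
key f = (fooName f, fooAge f)

-- Two elements count as duplicates when their keys are equal;
-- nubBy keeps the first occurrence of each.
dedupByNameAge :: [Foo] -> [Foo]
dedupByNameAge = nubBy ((==) `on` key)

main :: IO ()
main = mapM_ print (dedupByNameAge foos)
```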
Upvotes: 1