Jivan
Jivan

Reputation: 23078

Unique elements from a list according to a subset of fields

Given a record like

data Foo = Foo { fooName :: Text, fooAge :: Int, fooCity :: Text }

With a list of such elements, is there a function to remove duplicates on a subset of fields only, on the model of this hypothetical removeDupBy function?

foos =
  [
    Foo "john" 32 "London",
    Foo "joe" 18 "New York",
    Foo "john" 22 "Paris",
    Foo "john" 32 "Madrid",
    Foo "joe" 17 "Los Angeles",
    Foo "joe" 18 "Berlin"
  ]

> removeDupBy (\(Foo f) -> (fooName, fooAge)) foos 
[
    Foo "john" 32 "London",
    Foo "joe" 18 "New York",
    Foo "john" 22 "Paris",
    Foo "joe" 17 "Los Angeles"
]

I could implement my own but would prefer using one from a well-established library, which will probably be much more performant and be much more resilient against edge cases. I was thinking of using nub but I'm not sure how to map the actual Foo elements to the tuples (fooName, fooAge) that nub would filter out.

Upvotes: 2

Views: 109

Answers (2)

castletheperson
castletheperson

Reputation: 33496

Since you are dealing with only strings and numbers, you can use the Ord instance to remove duplicates efficiently, or even Hashable, which allows practically constant-time lookups.

Some functions which exactly match your desired signature are:

  1. nubOrdOn from the containers package
Data.Containers.ListUtils> nubOrdOn (\f -> (fooName f, fooAge f)) foos
  1. hashNubOn from the witherable package
Witherable> hashNubOn (\f -> (fooName f, fooAge f)) foos

You may find other options by searching on Hoogle for (a -> b) -> [a] -> [a]

If you need to do many operations like this, you may prefer to use Map or HashMap directly.

Upvotes: 2

Mark Seemann
Mark Seemann

Reputation: 233247

You can use nubBy:

Prelude Data.List> nubBy (\x y -> (fooName x, fooAge x) == (fooName y, fooAge y)) foos
[Foo {fooName = "john", fooAge = 32, fooCity = "London"},
 Foo {fooName = "joe", fooAge = 18, fooCity = "New York"},
 Foo {fooName = "john", fooAge = 22, fooCity = "Paris"},
 Foo {fooName = "joe", fooAge = 17, fooCity = "Los Angeles"}]

(Output formatted for enhanced readability)

Upvotes: 1

Related Questions