Reputation: 479
Usually I call distinct on a List to remove duplicates, or turn it into a Set. Now I have a List[MyObject]. MyObject is a case class, see below:
case class MyObject(s1: String, s2:String, s3:String)
Let's say we have the following cases:
val myObj1 = MyObject("", "gmail,com", "some text")
val myObj2 = MyObject("", "gmail,com", "")
val myObj3 = MyObject("some text", "gmail.com", "")
val myObj4 = MyObject("some text", "gmail.com", "some text")
val myObj5 = MyObject("", "ymail.com", "")
val myObj6 = MyObject("", "ymail.com", "some text")
val myList = List(myObj1, myObj2, myObj3, myObj4, myObj5, myObj6)
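Two questions:
1. How can I count how many objects are affected? Duplicates based on the content of s2?
2. How can I make the List distinct based on s2? I would consider two case objects the same when their s2 values are equal. Do I need to turn the case class into a normal class and override equals? Do I need my own Comparator for this, or can I use some Scala API method to achieve the same?
Upvotes: 4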
Views: 3886
Reputation: 1470
Here is a slightly safer way, using headOption instead of head:
myList.groupBy(_.s2).values.flatMap(_.headOption).toList
Alternatively,
scala.reflect.internal.util.Collections.distinctBy(myList)(_.s2)
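If you are on Scala 2.13 or later (an assumption, the question does not state a version), the standard collections have their own distinctBy, so the internal reflection utility is not needed. A minimal sketch using the sample data from the question; the name distinctByS2 is just for illustration:

```scala
// Assumes Scala 2.13+, where distinctBy is part of the standard collections.
case class MyObject(s1: String, s2: String, s3: String)

val myList = List(
  MyObject("", "gmail,com", "some text"),
  MyObject("", "gmail,com", ""),
  MyObject("some text", "gmail.com", ""),
  MyObject("some text", "gmail.com", "some text"),
  MyObject("", "ymail.com", ""),
  MyObject("", "ymail.com", "some text")
)

// Keeps the first element seen for each s2 value and preserves list order.
val distinctByS2: List[MyObject] = myList.distinctBy(_.s2)
```

Unlike the groupBy variant, this keeps the elements in their original order, and the case class does not need a custom equals.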
Upvotes: 2
Reputation: 149538
How can I count how many objects are affected? Duplicates based on the content of s2?
If you want to count how many objects are in each duplicate group (if you only want to know how many objects are going to be removed, subtract 1 from size):
myList.groupBy(_.s2).map(x => (x._1, x._2.size))
res0: scala.collection.immutable.Map[String,Int] = Map(ymail.com -> 2, gmail.com -> 2, gmail,com -> 2)
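To make the "subtract 1 from size" remark concrete, here is a small sketch that turns the group sizes into the number of objects that would be dropped. The names groupSizes, removedPerKey and totalRemoved are made up for this example:

```scala
case class MyObject(s1: String, s2: String, s3: String)

val myList = List(
  MyObject("", "gmail,com", "some text"),
  MyObject("", "gmail,com", ""),
  MyObject("some text", "gmail.com", ""),
  MyObject("some text", "gmail.com", "some text"),
  MyObject("", "ymail.com", ""),
  MyObject("", "ymail.com", "some text")
)

// Number of objects sharing each s2 value.
val groupSizes: Map[String, Int] =
  myList.groupBy(_.s2).map { case (key, group) => key -> group.size }

// One object per group survives, so size - 1 are removed per key.
val removedPerKey: Map[String, Int] =
  groupSizes.map { case (key, size) => key -> (size - 1) }

// Total number of objects that deduplication would drop.
val totalRemoved: Int = removedPerKey.values.sum
```

For the sample list this gives one removed object per key, three in total.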
How can I make the List distinct based on s2?
myList.groupBy(_.s2).map(_._2.head)
res1: scala.collection.immutable.Iterable[MyObject] = List(MyObject(,ymail.com,), MyObject(some text,gmail.com,), MyObject(,gmail,com,some text))
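Note that groupBy returns a Map, so the order of the result is not guaranteed to match the original list (the output above starts with ymail.com although it comes last in myList). If order matters and you cannot rely on a newer collections API, one possible sketch is a foldLeft that tracks the s2 values already seen. The names keptReversed and distinctInOrder are only illustrative:

```scala
case class MyObject(s1: String, s2: String, s3: String)

val myList = List(
  MyObject("", "gmail,com", "some text"),
  MyObject("", "gmail,com", ""),
  MyObject("some text", "gmail.com", ""),
  MyObject("some text", "gmail.com", "some text"),
  MyObject("", "ymail.com", ""),
  MyObject("", "ymail.com", "some text")
)

// Walk the list once, remembering which s2 values were already seen.
// Elements are prepended for efficiency, so the accumulator is reversed.
val (_, keptReversed) =
  myList.foldLeft((Set.empty[String], List.empty[MyObject])) {
    case ((seen, acc), obj) =>
      if (seen(obj.s2)) (seen, acc)      // duplicate key: skip
      else (seen + obj.s2, obj :: acc)   // first occurrence: keep
  }

// Restore the original order of first occurrences.
val distinctInOrder: List[MyObject] = keptReversed.reverse
```

This keeps the first object for each s2 in list order, again without touching equals on the case class.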
Upvotes: 6