Reputation: 53826
Using the code below, I'm attempting to remove duplicate elements with the distinct method on List:
class OrderDetSpecific(var size: Double,
                       var side: String,
                       var trade_id: Int,
                       var price: Double,
                       var time: String,
                       var code: String) {
  override def toString: String = {
    this.code + "," + this.time + "," + this.trade_id + "," + this.price
  }
}
val l = List(new OrderDetSpecific(1.0, "1", 1, 1.0, "10", "a"),
             new OrderDetSpecific(1.0, "1", 1, 1.0, "10", "a"))
println(l.size)
println(l.distinct.size)
This prints:
defined class OrderDetSpecific
l: List[OrderDetSpecific] = List(a,10,1,1.0, a,10,1,1.0)
2
2
But as you can see, the duplicate elements are not removed. I assumed the overridden toString method is used when detecting duplicates. Since the two entries are duplicates, shouldn't l.distinct.size be 1 instead of 2?
Update :
Converting to case class :
case class OrderDetSpecific(var size: Double,
                            var side: String,
                            var trade_id: Int,
                            var price: Double,
                            var time: String,
                            var code: String) {
  override def toString: String = {
    this.code + "," + this.time + "," + this.trade_id + "," + this.price
  }
}
val l = List(new OrderDetSpecific(1.0, "1", 1, 1.0, "10", "a"),
             new OrderDetSpecific(1.0, "1", 1, 1.0, "10", "a"))
println(l.size)
println(l.distinct.size)
Now the duplicate elements are removed. Does a case class use value equality in equals, which allows distinct to behave as expected?
When I override equals :
class OrderDetSpecific(var size: Double,
                       var side: String,
                       var trade_id: Int,
                       var price: Double,
                       var time: String,
                       var code: String) {
  override def toString: String = {
    this.code + "," + this.time + "," + this.trade_id + "," + this.price
  }
  override def equals(that: Any): Boolean =
    that match {
      case that: OrderDetSpecific => {
        time == that.time
      }
      case _ => false
    }
}
val l = List(new OrderDetSpecific(1.0, "1", 1, 1.0, "10", "a"),
             new OrderDetSpecific(1.0, "1", 1, 1.0, "10", "a"))
println(l.size)
println(l.distinct.size)
The duplicate element is not removed. Since my override compares the time attribute, which is the same in both elements, shouldn't the duplicate be removed?
Upvotes: 1
Views: 680
Reputation: 48420
As my override is on the time attribute, which is not distinct, shouldn't the duplicated element be removed?
No, because distinct is equivalent to distinctBy(identity), and distinctBy uses a HashSet, which relies on hashCode (as well as equals) to eliminate duplicates. However, you have not provided an override for hashCode. For example, without a hashCode override
import scala.collection.mutable

class Foo(var x: Int) {
  // Every Foo compares equal, but hashCode is still the default one
  override def equals(obj: Any): Boolean = true
}
val a = new Foo(42)
val b = new Foo(42)
a.## == b.##
mutable.HashSet(a, b).size == 1
outputs
res0: Boolean = false
res1: Boolean = false
whilst with a hashCode override provided
class Foo(var x: Int) {
  override def equals(obj: Any): Boolean = true
  override def hashCode(): Int = scala.runtime.Statics.anyHash(x)
}
...
we get
res0: Boolean = true
res1: Boolean = true
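Applied to the class from the question, the same idea might look like the sketch below. It assumes that two orders should count as equal whenever their time values match, which is what the equals override in the question suggests; hashing on time.hashCode is my choice for keeping hashCode consistent with that equals, not something the question specifies. It is written for the REPL or a worksheet, like the snippets above.

```scala
// Sketch only: equality and hashing are both based on `time` alone.
class OrderDetSpecific(var size: Double,
                       var side: String,
                       var trade_id: Int,
                       var price: Double,
                       var time: String,
                       var code: String) {
  override def toString: String =
    this.code + "," + this.time + "," + this.trade_id + "," + this.price

  override def equals(that: Any): Boolean = that match {
    case that: OrderDetSpecific => time == that.time
    case _                      => false
  }

  // Must agree with equals: objects that are equal need equal hash codes.
  override def hashCode(): Int = time.hashCode
}

val l = List(
  new OrderDetSpecific(1.0, "1", 1, 1.0, "10", "a"),
  new OrderDetSpecific(1.0, "1", 1, 1.0, "10", "a"))

println(l.size)          // 2
println(l.distinct.size) // 1
```

One caveat: because time is a var, changing it while an instance sits inside a hash-based collection would change its hash code and can make lookups fail, so immutable fields are usually the safer choice here.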
However, there is no need to fiddle with these overrides; instead try
l.distinctBy(_.time)
which outputs
res0: List[OrderDetSpecific] = List(a,10,1,1.0)
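For completeness, here is a self-contained sketch of the distinctBy approach with a slightly larger list. The third order and the names orders and unique are made up for illustration. Note that distinctBy was added in Scala 2.13, so this assumes that version or later. The class deliberately has no equals or hashCode override, since distinctBy only compares the extracted keys (here, Strings).

```scala
// Requires Scala 2.13+, where distinctBy was added to the collections.
class OrderDetSpecific(var size: Double,
                       var side: String,
                       var trade_id: Int,
                       var price: Double,
                       var time: String,
                       var code: String) {
  override def toString: String =
    this.code + "," + this.time + "," + this.trade_id + "," + this.price
}

// Hypothetical sample data: the first two orders share time "10".
val orders = List(
  new OrderDetSpecific(1.0, "1", 1, 1.0, "10", "a"),
  new OrderDetSpecific(2.0, "1", 2, 2.0, "10", "b"),
  new OrderDetSpecific(3.0, "1", 3, 3.0, "11", "c"))

// Keeps the first element seen for each distinct `time` value.
val unique = orders.distinctBy(_.time)

println(unique) // List(a,10,1,1.0, c,11,3,3.0)
```

Since only the key is compared, the orders with codes "a" and "b" are treated as duplicates even though their other fields differ, and the earlier one wins.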
Upvotes: 4