Reputation: 2230
I use Scala's Range.by to split a range into an array, but it misses the last element for some particular bucket counts, for example 100. I am puzzled; here is a demo:
object SplitDemo extends App {
  val min = 0.0
  val max = 7672.142857142857
  val bucketNum = 100
  def splitsBucket1(min: Double, max: Double, num: Int) = (min to max by ((max - min) / num)).toArray
  def splitsBucket2(min: Double, max: Double, num: Int): Array[Double] = {
    val rst = Array.fill[Double](num + 1)(0)
    rst(0) = min
    rst(num) = max
    val step = (max - min) / num
    for (i <- 1 until num) rst(i) = rst(i - 1) + step
    rst
  }
  val split1 = splitsBucket1(min, max, bucketNum)
  println(s"Split1 size = ${split1.size}, %s".format(split1.takeRight(4).mkString(",")))
  val split2 = splitsBucket2(min, max, bucketNum)
  println(s"Split2 size = ${split2.size}, %s".format(split2.takeRight(4).mkString(",")))
}
The output is as follows:
Split1 size = 100,7365.257142857143,7441.978571428572,7518.700000000001,7595.421428571429
Split2 size = 101,7441.978571428588,7518.700000000017,7595.421428571446,7672.142857142857
When num = 100, split1 misses the last element, but split2 does not (which is what I expect). For other values of num, e.g. 130, split1 and split2 give the same result.
What causes the difference?
Upvotes: 2
Views: 206
Reputation: 14224
It's the usual floating point inaccuracy.
Look at how max comes out differently after dividing it and multiplying it back:
scala> 7672.142857142857 / 100 * 100
res1: Double = 7672.142857142858
And this number is larger than max, so it doesn't fit into the range:
scala> max / bucketNum * bucketNum > max
res2: Boolean = true
It's still more accurate than adding the step 100 times, as splitsBucket2 does:
scala> var result = 0.0
result: Double = 0.0
scala> for (_ <- 0 until 100) result += (max - min) / bucketNum
scala> result
res4: Double = 7672.142857142875
This is larger than both max and max / bucketNum * bucketNum. You avoid this in splitsBucket2 by explicitly assigning rst(num) = max, though.
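To compare the two approaches for other inputs, a small sketch like the one below may help. The object name DriftDemo and both helper names are made up for illustration; they only repeat the two calculations shown above and report how far each lands from max.

```scala
object DriftDemo {
  // Distance from max after adding the step num times (as in the loop above).
  def additionError(min: Double, max: Double, num: Int): Double = {
    val step = (max - min) / num
    var acc = min
    for (_ <- 0 until num) acc += step // rounding error can build up on every addition
    math.abs(acc - max)
  }

  // Distance from max after a single multiplication of the step by num.
  def multiplicationError(min: Double, max: Double, num: Int): Double = {
    val step = (max - min) / num
    math.abs(min + step * num - max) // only a few roundings, independent of num
  }
}
```

For the values in the question, DriftDemo.additionError(0.0, 7672.142857142857, 100) should come out larger than DriftDemo.multiplicationError(0.0, 7672.142857142857, 100), matching the REPL results above.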
You can try the following split implementation:
def splitsBucket3(min: Double, max: Double, num: Int): Array[Double] = {
  val step = (max - min) / num
  Array.tabulate(num + 1)(min + step * _)
}
It is guaranteed to have the correct number of elements, and has fewer numeric precision problems than splitsBucket2.
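Note that, as the division and multiplication above shows, min + step * num can still differ from max in the last digit. If the last boundary must be exactly max, one possible variant is to combine the multiplication-based steps with the explicit end assignment from splitsBucket2. This is only a sketch; PinnedSplit and splitsBucketPinned are hypothetical names, not part of any library.

```scala
object PinnedSplit {
  // Variant of splitsBucket3 that sets the final boundary to max explicitly.
  def splitsBucketPinned(min: Double, max: Double, num: Int): Array[Double] = {
    require(num > 0, "num must be positive")
    val step = (max - min) / num
    // Compute each boundary from min directly, so errors do not accumulate.
    val rst = Array.tabulate(num + 1)(i => min + step * i)
    rst(num) = max // overwrite the last boundary so it equals max exactly
    rst
  }
}
```

Calling PinnedSplit.splitsBucketPinned(0.0, 7672.142857142857, 100) should give an array of 101 elements whose last element is exactly max.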
Upvotes: 2