bourneli

Reputation: 2230

Scala range split is missing the last element

I use Scala's Range `by` to split a range into an array, but for certain bucket counts, for example 100, the last element is missing. I am puzzled; here is a demo:

object SplitDemo extends App {
  val min = 0.0
  val max = 7672.142857142857
  val bucketNum = 100

  def splitsBucket1(min: Double, max: Double, num: Int) = (min to max by ((max - min) / num)).toArray
  def splitsBucket2(min: Double, max: Double, num: Int): Array[Double] = {
    val rst = Array.fill[Double](num + 1)(0)
    rst(0) = min
    rst(num) = max

    val step = (max-min)/num
    for(i <- 1 until num) rst(i) = rst(i-1)+step

    rst
  }

  val split1 = splitsBucket1(min, max, bucketNum)
  println(s"Split1 size = ${split1.size}, %s".format(split1.takeRight(4).mkString(",")))

  val split2 = splitsBucket2(min, max, bucketNum)
  println(s"Split2 size = ${split2.size}, %s".format(split2.takeRight(4).mkString(",")))

}

The output is as follows:

Split1 size = 100,7365.257142857143,7441.978571428572,7518.700000000001,7595.421428571429
Split2 size = 101,7441.978571428588,7518.700000000017,7595.421428571446,7672.142857142857

When num = 100, split1 misses the last element, but split2 does not (which is what I expect). For other values of num, e.g. 130, split1 and split2 give the same result.
What causes the difference?

Upvotes: 2

Views: 206

Answers (1)

Kolmar

Reputation: 14224

It's the usual floating point inaccuracy.

Look at how max comes out differently after dividing it and multiplying it back:

scala> 7672.142857142857 / 100 * 100
res1: Double = 7672.142857142858

And this number is larger than max, so it doesn't fit into the range:

scala> max / bucketNum * bucketNum > max
res2: Boolean = true

It's still more accurate than adding the step 100 times, as splitsBucket2 does:

scala> var result = 0.0
result: Double = 0.0

scala> for (_ <- 0 until 100) result += (max - min) / bucketNum

scala> result
res4: Double = 7672.142857142875

This is larger than both max and max / bucketNum * bucketNum. You avoid this in splitsBucket2 by explicitly assigning rst(num) = max, though.
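The two errors can be put side by side. The following is only a minimal sketch, assuming the same min, max and bucketNum as in the question; the object name DriftSketch and its members are hypothetical and exist only for illustration.

```scala
object DriftSketch {
  val min = 0.0
  val max = 7672.142857142857
  val bucketNum = 100
  val step = (max - min) / bucketNum

  // Upper edge reached by adding the step bucketNum times:
  // every addition rounds, so the error accumulates.
  val accumulated: Double =
    (0 until bucketNum).foldLeft(min)((acc, _) => acc + step)

  // Upper edge reached by multiplying the step by the index:
  // only a couple of roundings, regardless of bucketNum.
  val multiplied: Double = min + step * bucketNum

  def main(args: Array[String]): Unit = {
    println(s"accumulated - max = ${accumulated - max}")
    println(s"multiplied  - max = ${multiplied - max}")
  }
}
```

Both values overshoot max for this input, but the accumulated one overshoots by more, which is why computing each edge from its index is the safer choice.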


You can try the following split implementation:

def splitsBucket3(min: Double, max: Double, num: Int): Array[Double] = {
  val step = (max - min) / num
  Array.tabulate(num + 1)(min + step * _)
}

It is guaranteed to have the correct number of elements, and it has fewer numeric precision problems than splitsBucket2.
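Note that min + step * num can still differ from max in the last digit (as the division-and-multiplication check earlier shows). If the final edge has to equal max exactly, one option is to pin it, as splitsBucket2 does. The following is a sketch under that assumption; SplitsSketch and splitsPinned are hypothetical names, not part of the question or of any library.

```scala
object SplitsSketch {
  // Hypothetical variant of splitsBucket3 that forces the last edge to max.
  def splitsPinned(min: Double, max: Double, num: Int): Array[Double] = {
    require(num > 0, "num must be positive")
    val step = (max - min) / num
    // Each edge is computed from its index, so rounding error does not accumulate.
    val edges = Array.tabulate(num + 1)(i => min + step * i)
    // Overwrite the last edge so the upper bound is exactly max.
    edges(num) = max
    edges
  }

  def main(args: Array[String]): Unit = {
    val edges = splitsPinned(0.0, 7672.142857142857, 100)
    println(s"size = ${edges.length}, last = ${edges.last}")
  }
}
```

The array always has num + 1 elements, and the first and last edges are exactly min and max; only the interior edges carry a small rounding error.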

Upvotes: 2
