sanyi14ka
sanyi14ka

Reputation: 829

How to apply a function for the first certain elements of an array in Scala?

Suppose we have an array:

val arr = Array[(String, Int)](("A", 3), ("B", 5), ("C", 2), ("D", 7), ("E", 4))

I want to calculate the cumulative sum of the numbers until it exceeds a threshold. Then I want to concatenate a ".p" suffix for each letter which appeared in the tuples before the threshold.

For instance, let the threshold be 14. In this case I would like an output:

(("A.p", 3), ("B.p", 5), ("C.p", 2), ("D", 7), ("E", 4))

Because 3 + 5 + 2 = 10 < 14, but 3 + 5 + 2 + 7 = 17 > 14.

What I tried is the following:

val resIndex = arr.map(_._2).scanLeft(0){_ + _}.tail.takeWhile(_ < 14).length 
val resSplit = arr.toList.splitAt(resIndex)
val res = resSplit._1.map(e => (e._1.concat(".p"), e._2)) :: resSplit._2

But I'm sure there is a more efficient and better way to achieve this.

Update!

Thank you for all of your answers! I made a little benchmark to compare the solutions and the most efficient way of doing this was with Assad Mendelson's improved solution.

enter image description here

For the benchmark I used a randomly generated array with 0.5 million of elements and a threshold = 10000.

For benchmark I used:

    def time[A](f: => A) = {
  val s = System.nanoTime
  val ret = f
  println("time: "+(System.nanoTime-s)/1e6+"ms")
  ret
}

time {
  println("jwvh solution")
  arr.foldLeft((0,Array[(String,Int)]())){ case ((sum,ar),(s,i)) =>
    if (sum + i < threshold) (sum+i, ar:+((s+".p", i)))
    else (sum,ar:+(s,i))
  }._2
}

Upvotes: 2

Views: 522

Answers (4)

Nhon Dinh
Nhon Dinh

Reputation: 221

Just simplify it using yield:

 var sum=0
 var result=for(var i <- arr) yield if(sum<14){
  sum+=i._2
   (i._1 + "p",i._2)
} else i

Upvotes: 1

jwvh
jwvh

Reputation: 51271

OK, here's my 2-cents.

arr.foldLeft((0,Array[(String,Int)]())){ case ((sum,ar),(s,i)) =>
  if (sum + i < 14) (sum+i, ar:+((s+".p", i)))
  else (sum,ar:+(s,i))
}._2
// res0: Array[(String, Int)] = Array((A.p,3), (B.p,5), (C.p,2), (D,7), (E,4))

Like the others, a single foldLeft will do the trick. It's just a little tricky keeping track of the individual elements as the result Array is built.

Upvotes: 2

Assaf Mendelson
Assaf Mendelson

Reputation: 12991

First you can simplify your code by combining the last two lines to:

val res = arr.indices.map(ind => if (ind < resIndex) (arr(ind)._1 + ".p", arr(ind)._2) else arr(ind))

The functional way is probably to use Yuval Itzchakov's answer.

If you want better performance then you can go less functional.

First notice that you go over the data twice. You can solve this by doing:

var sum = 0
val threshold = 14
for(v <- arr) yield if (sum < threshold) {
       sum += v._2
       (v._1 + ".p", v._2)
     } else {
       v
     }

You can improve the results even more by noticing that in practice you are creating multiple copies of the data. To solve this you can do:

var sum = 0
val threshold = 14
var ind = 0
while (sum + arr(ind)._2 < threshold) {
    sum += arr(ind)._2
    arr(ind) = (arr(ind)._1 + ".p", arr(ind)._2)
    ind += 1
}
val res = arr

That said, for smaller arrays I would go with the first optimization (joining the last two lines) as it is clearer to read.

Upvotes: 1

Yuval Itzchakov
Yuval Itzchakov

Reputation: 149528

You can do it in a single iteration with foldLeft:

val arr =
Array[(String, Int)](("A", 3), ("B", 5), ("C", 2), ("D", 7), ("E", 4))

val threshold = 14
val (values, sum): (List[(String, Int)], Int) =
  arr.foldLeft((List.empty[(String, Int)], 0)) {
    case ((accumulatedTuples, acc), (str, value)) =>
      val sum = acc + value
      sum match {
        case sum if sum < threshold =>
          ((s"$str.p", value) :: accumulatedTuples, sum)
        case _ => ((str, value) :: accumulatedTuples, acc)
      }
  }

values.foreach(println)
println(s"Sum: $sum")

Yields:

(E,4)
(D,7)
(C.p,2)
(B.p,5)
(A.p,3)

Sum: 10 

If the order of the values matters, we'll need to add a .reverse at the end, or alternatively foldRight instead.

Upvotes: 3

Related Questions