Dan Ciborowski - MSFT

Reputation: 7247

Scala Convert For Loop to Functional Method

I would like to convert the following for loop into a functional Scala method.

for (i <- 15 to 25){
  count_table_rdd = count_table_rdd.union(training_data.map(line => (i+"_"+line(i)+"_"+line(0), 1)).reduceByKey(_ + _))
}

I have tried looking at the foreach method, but I do not want to transform every item, just indices 15 through 25.

Upvotes: 1

Views: 292

Answers (3)

maasg

Reputation: 37435

Taking this from the Spark perspective, it would be better to transform the training_data RDD once instead of looping to select the given columns.

Something like:

training_data.flatMap(line => (15 to 25).map(i => (i + "_" + line(i) + "_" + line(0), 1)))
  .reduceByKey(_ + _)

This will be more efficient than joining pieces of an RDD together using union.
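
For reference, here is a minimal self-contained sketch of this approach. The local SparkContext setup and the sample data are assumptions added for illustration, not part of the original question:

import org.apache.spark.{SparkConf, SparkContext}

object FlatMapCounts {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("counts").setMaster("local[*]"))

    // Hypothetical training data: each line is an array of at least 26 string fields.
    val training_data = sc.parallelize(Seq(
      Array.tabulate(26)(j => s"a$j"),
      Array.tabulate(26)(j => s"b$j")
    ))

    // One pass over the data: emit a (key, 1) pair for each of columns 15..25
    // of each line, then reduce by key, instead of unioning eleven separate RDDs.
    val counts = training_data
      .flatMap(line => (15 to 25).map(i => (i + "_" + line(i) + "_" + line(0), 1)))
      .reduceByKey(_ + _)

    counts.collect().foreach(println)
    sc.stop()
  }
}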

Upvotes: 1

curious

Reputation: 2928

You could also use a tail-recursive function, though @Rex Kerr's method is the one you should follow. The accumulator below is typed RDD[(String, Int)] to match the (key, count) pairs produced by reduceByKey; adjust the types of count_table_rdd and res to whatever your data actually holds.

Tail-recursive version:

import org.apache.spark.rdd.RDD

@annotation.tailrec
def f(start: Int = 15, end: Int = 25,
      res: RDD[(String, Int)] = count_table_rdd): RDD[(String, Int)] = {
  if (start > end) res // base case: return the accumulator
  else {
    val temp = res.union(
      training_data.map(line => (start + "_" + line(start) + "_" + line(0), 1))
        .reduceByKey(_ + _))
    f(start + 1, end, temp)
  }
}

f()

You can specify start and end explicitly too:

f(30, 45)

Upvotes: 1

Rex Kerr

Reputation: 167921

You can fold.

val result = (count_table_rdd /: (15 to 25)){ (c, i) => c.union(...) }

If you see that you've got a collection of data and you're pushing a value through it, updating that value at each step, you should reach for a fold, because that's exactly what a fold does.
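
For reference, /: is symbolic shorthand for foldLeft, so the line above is equivalent to the sketch below, with the ... filled in from the question's loop body. count_table_rdd and training_data are the asker's RDDs; their element types are assumed from the question:

// (z /: xs)(op) is the same as xs.foldLeft(z)(op)
val result = (15 to 25).foldLeft(count_table_rdd) { (c, i) =>
  // Each step unions in the per-column counts for column i,
  // exactly as the body of the original for loop did.
  c.union(training_data.map(line => (i + "_" + line(i) + "_" + line(0), 1))
    .reduceByKey(_ + _))
}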

Upvotes: 3
