Twistleton
Twistleton

Reputation: 2935

How can I rewrite this function in a functional style of Scala?

I'm trying to rewrite a function in a functional style. I know the function is bad. But how can I avoid the usage of the var field watermarkTime?

The function sets day "1" for the rows with a continuous increasing timestamp. If the timestamp is lesser than the timestamp of the previous row, then set day "2".

case class Row(time: String, name: String, var day: Int)

val rows = List(new Row("09:18:52", "3_0711-082757_01001", 0),
                new Row("09:23:18", "3_0711-082757_01002", 0),
                new Row("09:33:43", "3_0711-082757_01004", 0),
                new Row("10:20:00", "3_0711-082757_01011", 0),
                new Row("05:03:38", "3_0711-082757_02001", 0),  // set secound day
                new Row("05:07:51", "3_0711-082757_02002", 0),
                new Row("05:13:02", "3_0711-082757_02003", 0),
                new Row("05:19:16", "3_0711-082757_02004", 0),
                new Row("10:54:27", "3_0711-082757_02015", 0),  // set first day 
                new Row("11:00:38", "3_0711-082757_02016", 0),
                new Row("11:07:28", "3_0711-082757_02017", 0))


def setDayFP(rows: List[Row]): List[Row] = {

  var watermarkTime = 0

  for (row <- rows) {

    val newTime = row.time.replaceAll(":","").toInt 

    if (watermarkTime < newTime) {
      watermarkTime = newTime
      row.day = 1
    }  else  {
      row.day = 2
    }
  }
  return rows 
}

Here is the result (sorted by day and name):

Row(09:18:52,3_0711-082757_01001,1)
Row(09:23:18,3_0711-082757_01002,1)
Row(09:33:43,3_0711-082757_01004,1)
Row(10:20:00,3_0711-082757_01011,1)
Row(10:54:27,3_0711-082757_02015,1)
Row(11:00:38,3_0711-082757_02016,1)
Row(11:07:28,3_0711-082757_02017,1)
Row(05:03:38,3_0711-082757_02001,2)
Row(05:07:51,3_0711-082757_02002,2)
Row(05:13:02,3_0711-082757_02003,2)
Row(05:19:16,3_0711-082757_02004,2)

I'm looking for a better solution. Thanks in advance!

Upvotes: 1

Views: 308

Answers (5)

tgr
tgr

Reputation: 3608

I tried a bit and I think the foldLeft is not the way to go here. Here is my solution:

def sdfp(rows: List[Row]): List[Row] = {
  var watermark = 0
  rows.map(row => {
    val time = row.time.replaceAll(":","").toInt
    if(watermark < time) {
      watermark = time
      row.day = 1
    } else {
      row.day = 2
    }
    row
  }
}

map takes a function and applies it to all elements of the collection.

Edit

I found another way to solve this problem:

def sdfp2(rows: List[Row]): List[Row] = {
  var watermark = 0
  for {
    row <- rows
    val time = row.time.replaceAll(":","").toInt
    val after = watermark < time
  } yield {
    if(after) { watermark = row; row.day = 1 } else row.day = 2
    row
  }
}

Upvotes: 1

Eastsun
Eastsun

Reputation: 18859

Here is my solution:

case class Row(time: String, name: String, day: Int)

def setDay(rs: List[Row], wt: String = ""): List[Row] = rs match {
  case x::xs => if(wt < x.time) x.copy(day=1) :: setDay(xs,x.time)
                else x.copy(day=2) :: setDay(xs,wt)
  case Nil => Nil
}

Use it as setDay(rows).

Upvotes: 2

xiefei
xiefei

Reputation: 6599

Just for fun:

case class Row(time: String, name: String, day: Int) {
  def timeAsInt = time.replaceAll(":", "").toInt
}

val rows = List(new Row("09:18:52", "3_0711-082757_01001", 0),
                new Row("09:23:18", "3_0711-082757_01002", 0),
                new Row("09:33:43", "3_0711-082757_01004", 0),
                new Row("10:20:00", "3_0711-082757_01011", 0),
                new Row("05:03:38", "3_0711-082757_02001", 0),  // set secound day
                new Row("05:07:51", "3_0711-082757_02002", 0),
                new Row("05:13:02", "3_0711-082757_02003", 0),
                new Row("05:19:16", "3_0711-082757_02004", 0),
                new Row("10:54:27", "3_0711-082757_02015", 0),  // set first day 
                new Row("11:00:38", "3_0711-082757_02016", 0),
                new Row("11:07:28", "3_0711-082757_02017", 0))


def setDayFP(rows: List[Row]): List[Row] = {
  lazy val rowStream:Stream[Row] = for((row, watermarkTime) <- rows.toStream.zip(watermarkStream)) yield
    row.copy(day = if(watermarkTime < row.timeAsInt) 1 else 2) 


  lazy val watermarkStream:Stream[Int] = 0 #:: rowStream.zip(watermarkStream).map{ case (row, watermarkTime) =>
    math.max(watermarkTime, row.timeAsInt)
  }

  rowStream.toList
}

setDayFP(rows)

Upvotes: 2

Edmondo
Edmondo

Reputation: 20080

Depending on how functional you want to go, you have at least two approaches:

The first solution is using foldLeft on a collection of mutable data structures, which is not the most functional way. This is what you are trying to do in imperative style, and I do not see why you need to return the list at the end of setDayFP since you do not change it.

  def foldLeftExample(rows:List[Row]) = {
    rows.foldLeft(0){
      case(previousWaterMarkTime,currentRow) => 
        val newTime = currentRow.time.replaceAll(":","").toInt
        if(previousWaterMarkTime<newTime){
          currentRow.day=1
          newTime
        }
        else{
          currentRow.day=2
          previousWaterMarkTime
        }
    }
  }

When folding, you pass an initial value of type A to the collection and a function which takes a Tuple[A,YourCollectionType] and returns A. If you refactor it, you can see that it would look even better:

/**
   * Updates the row and returns the new waterMark time
   *
   */
  def updateRow(row:Row,previousWaterMarkTime:Int):Int = {
    val newTime = currentRow.time.replaceAll(":","").toInt
    if(previousWaterMarkTime<newTime){
      currentRow.day=1
      newTime
    }
    else{
      currentRow.day=2
      previousWaterMarkTime
    }
  }

  def foldLeftExample(rows:List[Row]) = {
    rows.foldLeft(0){
      case(previousWaterMarkTime,currentRow) => updateRow(currentRow,previousWaterMarkTime)
    }
  }

As I said, however, this is not the best approach, because you are mutating currentRow.

The best approach is more complex to understand for a new to FP.. and oops ! Someone has posted it before me. Take my answer as an intermediate version between yours, and the exact version from mhs

Upvotes: 0

Malte Schwerhoff
Malte Schwerhoff

Reputation: 12852

The solution below

  • has an immutable list class and uses copying instead of field updates

  • uses foldLeft instead of a loop

  • uses a tuple as the fold's aggregator to carry around the watermarkTime

However, I personally would stick with your for-loop and the var watermarkTime, since the foldLeft doesn't make it more readable. I would keep the day: Option[Int] and the copying, though.

case class Row(time: String, name: String, day: Option[Int] = None)

val rows = List(new Row("09:18:52", "3_0711-082757_01001"),
                new Row("09:23:18", "3_0711-082757_01002"),
                new Row("09:33:43", "3_0711-082757_01004"),
                new Row("10:20:00", "3_0711-082757_01011"),
                new Row("05:03:38", "3_0711-082757_02001"),
                new Row("05:07:51", "3_0711-082757_02002"),
                new Row("05:13:02", "3_0711-082757_02003"),
                new Row("05:19:16", "3_0711-082757_02004"),
                new Row("10:54:27", "3_0711-082757_02015"),
                new Row("11:00:38", "3_0711-082757_02016"),
                new Row("11:07:28", "3_0711-082757_02017"))


def setDayFP(rows: List[Row]): List[Row] = {
  rows.foldLeft((0, List[Row]()))
               {case ((watermarkTime, resultRows), row) =>

    val newTime = row.time.replaceAll(":","").toInt

    val (newWatermarkTime, day) = 
      if (watermarkTime < newTime)
        (newTime, 1)
      else
        (watermarkTime, 2)

    (newWatermarkTime, row.copy(day = Some(day)) :: resultRows)
  }._2.reverse
}

setDayFP(rows).foreach(println)

Upvotes: 3

Related Questions