Surender Raja

Reputation: 3609

In Scala, how do we find the latest record for each customer?

My input file is shown below. It contains purchase details for each customer.

Input:

 100,Surender,2015-01-23,PHONE,20000
 100,Surender,2015-01-24,LAPTOP,25000
 101,Ajay,2015-02-21,LAPTOP,40000
 101,Ajay,2015-03-10,MUSIC_SYSTEM,50000
 102,Vikram,2015-07-20,WATCH,60000

My requirement is to find the latest purchase details for each customer.

Expected output:

List(101,Ajay,2015-03-10,MUSIC_SYSTEM,50000)
List(100,Surender,2015-01-24,LAPTOP,25000)
List(102,Vikram,2015-07-20,WATCH,60000)

I tried the code below, and it gives me the expected output.

However, the logic reads much like Java.

My Scala code:

 package pack1

 import scala.io.Source
 import scala.collection.mutable.ListBuffer

 object LatestObj {

   def main(args: Array[String]) = {
     var maxDate = "0001-01-01"
     var actualData: List[String] = List()
     var resultData: ListBuffer[String] = ListBuffer()

     val myList = Source.fromFile("D:\\Scala_inputfiles\\records.txt").getLines().toList
     val myGrped = myList.groupBy { x => x.substring(0, 3) }
     // println(myGrped)
     for (mappedIterator <- myGrped) {
       // println(mappedIterator._2)
       actualData = mappedIterator._2
       maxDate = findMaxDate(actualData)
       println(actualData.filter { x => x.contains(maxDate) })
     }
   }

   def findMaxDate(mytempList: List[String]): String = {
     var maxDate = "0001-01-01"
     for (m <- mytempList) {
       var transDate = m.split(",")(2)
       if (transDate > maxDate) {
         maxDate = transDate
       }
     }
     return maxDate
   }
 }

Could someone help me do the same thing in a simpler way using Scala?

Or is the above code the only way to achieve this logic?

Upvotes: 2

Views: 1645

Answers (2)

The Archetypal Paul

Reputation: 41769

An even simpler version, also using a case class that coincidentally has the same name. Unlike Tzach's answer, it doesn't remove bad records, and I leave every field as a String.

case class Record(id: String, name: String, dateString: String, item: String, count: String)

// myList is the List[String] of input lines read in the question
myList.map { line =>
  val Array(id, name, dateString, item, count) = line.split(",")
  Record(id, name, dateString, item, count)
}
.groupBy(_.id)
.map(_._2.maxBy(_.dateString))
.toList
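
For anyone who wants to try this without an input file, here is a minimal self-contained sketch of the same groupBy/maxBy idea. The names `LatestByStringDate`, `Purchase`, `sample` and `latestPerCustomer` are made up for illustration, and the sample lines are copied from the question. It differs from the snippet above in two small ways: it uses `collect` so lines without exactly five fields are skipped instead of throwing a `MatchError`, and it sorts the result by id so the output order is stable. Comparing the dates as plain strings is only safe because the `yyyy-MM-dd` format sorts lexicographically in the same order as the dates themselves.

```scala
// Sketch only: all names here are illustrative, not part of any library.
object LatestByStringDate {
  case class Purchase(id: String, name: String, date: String, item: String, amount: String)

  // Sample lines copied from the question.
  val sample: List[String] = List(
    "100,Surender,2015-01-23,PHONE,20000",
    "100,Surender,2015-01-24,LAPTOP,25000",
    "101,Ajay,2015-02-21,LAPTOP,40000",
    "101,Ajay,2015-03-10,MUSIC_SYSTEM,50000",
    "102,Vikram,2015-07-20,WATCH,60000"
  )

  // Parse each line, group by customer id, and keep the record with the
  // greatest date string in each group. Lines that do not split into
  // exactly five fields are dropped by collect.
  def latestPerCustomer(lines: List[String]): List[Purchase] =
    lines
      .map(_.split(","))
      .collect { case Array(id, name, date, item, amount) =>
        Purchase(id, name, date, item, amount)
      }
      .groupBy(_.id)
      .values
      .map(_.maxBy(_.date))
      .toList
      .sortBy(_.id)

  def main(args: Array[String]): Unit =
    latestPerCustomer(sample).foreach(println)
}
```

If the dates were in a format such as `dd-MM-yyyy`, the string comparison would no longer match date order and the field would need to be parsed first.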

Upvotes: 2

Tzach Zohar

Reputation: 37852

Here's a simple version using groupBy and reduce, plus using a convenient case class to elegantly represent records:

import java.text.SimpleDateFormat
import java.util.Date
import scala.io.Source

case class Record(id: Int, username: String, date: Date, product: String, cost: Double)

val dateFormat: SimpleDateFormat = new SimpleDateFormat("yyyy-MM-dd")
val stringList = Source.fromFile("./records.txt").getLines().toList

// split by comma and parse into case class - while REMOVING bad records
val records = stringList.map(_.split(",")).collect {
  case Array(id, username, date, product, cost) => Record(id.toInt, username, dateFormat.parse(date), product, cost.toDouble)
}

// group by key, and reduce each group to latest record
val result = records.groupBy(_.id).map { _._2.reduce {
  (r1: Record, r2: Record) => if (r1.date.after(r2.date)) r1 else r2
}}

result.foreach(println)
// prints:
// Record(101,Ajay,Tue Mar 10 00:00:00 IST 2015,MUSIC_SYSTEM,50000.0)
// Record(100,Surender,Sat Jan 24 00:00:00 IST 2015,LAPTOP,25000.0)
// Record(102,Vikram,Mon Jul 20 00:00:00 IDT 2015,WATCH,60000.0)

Note that this implementation does not make any use of mutable variables or collections, which often simplifies the code significantly, and is considered more idiomatic for functional languages like Scala.
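
As a variation on the same immutable style, here is a rough sketch that swaps `java.util.Date` for `java.time.LocalDate`, which avoids the time-zone text in the printed dates. This is a substitution, not what the code above does. The names `LatestByLocalDate`, `Purchase`, `parse` and `latestPerCustomer` are hypothetical, and wrapping the conversion in `Try` is an assumption about how unparseable numbers or dates should be handled (they are silently dropped here).

```scala
import java.time.LocalDate
import scala.util.Try

// Sketch only: all names here are illustrative.
object LatestByLocalDate {
  case class Purchase(id: Int, name: String, date: LocalDate, item: String, cost: Double)

  // Returns None for lines with the wrong number of fields or with
  // values that fail to parse (assumed policy: skip such lines).
  def parse(line: String): Option[Purchase] = line.split(",") match {
    case Array(id, name, date, item, cost) =>
      Try(Purchase(id.trim.toInt, name, LocalDate.parse(date), item, cost.toDouble)).toOption
    case _ => None
  }

  // Group valid records by customer id and keep the one with the latest date.
  // toEpochDay gives a Long, so no custom Ordering for LocalDate is needed.
  def latestPerCustomer(lines: Seq[String]): Map[Int, Purchase] =
    lines
      .flatMap(line => parse(line))
      .groupBy(_.id)
      .map { case (id, purchases) => id -> purchases.maxBy(_.date.toEpochDay) }
}
```

Calling `LatestByLocalDate.latestPerCustomer(Source.fromFile("./records.txt").getLines().toList)` would then give one `Purchase` per customer id, keyed by that id.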

Upvotes: 2
