Reputation: 31
I have a comma-separated file of employee records. I want to read it, group the records by designation and department, and find the average salary for each group. Below is the code I tried, which uses map. How do I implement this using groupBy?
import scala.io.Source
object Problem {
  case class Employee(empId: String,
                      designation: String,
                      age: Int,
                      salary: Long,
                      department: Int)
  def main(arrg:Array[String]){
    var a = Source.fromFile("someFile.txt").
      getLines().
      map( _.split(",") ).
      map( l => ((l(1)+l(4)),l(3)) ).
      mapValues( _.map( _.salary ).sum/_.map.size )
    print(a)
  }
}
Upvotes: 0
Views: 2642
Reputation: 16308
Let me share some code from my private utils stash.
Given these dependencies in build.sbt:
libraryDependencies ++= Seq(
  "com.chuusai" %% "shapeless" % "2.2.3",
  "org.scalaz" %% "scalaz-core" % "7.1.1",
  "org.typelevel" %% "scalaz-spire" % "0.2",
  "com.github.melrief" %% "purecsv" % "0.0.2")
these imports:
import purecsv.safe._
import shapeless.tag.Tagger
import scala.{util => ut}
import scalaz._
import Scalaz._
import spire.implicits._
import shapeless._
import shapeless.syntax.singleton._
import ops.hlist.{Selector, RightFolder}
this handful of utils:
trait CorrespondingLow extends Poly2 {
  implicit def drop[E, L <: HList, L2 <: HList] = at[E, (L, Tagger[L2])] { case (_, (l, aux)) => (l, aux) }
}
object CorrespondingFolder extends CorrespondingLow {
  implicit def take[E, L <: HList, L2 <: HList]
  (implicit sel2: Selector[L2, E]) = at[E, (L, Tagger[L2])] { case (e, (l, aux)) => (e :: l, aux) }
}
class corresponding[R2] {
  def move[R1, L1 <: HList, L2 <: HList, L2A <: HList]
  (rec: R1)
  (implicit lgen1: LabelledGeneric.Aux[R1, L1],
   lgen2: LabelledGeneric.Aux[R2, L2],
   rf: RightFolder.Aux[L1, (HNil, Tagger[L2]), CorrespondingFolder.type, (L2A, Tagger[L2])],
   lgen2a: LabelledGeneric.Aux[R2, L2A]): R2 =
    lgen2a.from(lgen1.to(rec).foldRight((HNil: HNil, tag[L2]))(CorrespondingFolder)._1)
}
object corresponding {
  def apply[R2] = new corresponding[R2]
}
implicit class TryOps[T](t: ut.Try[T]) {
  def toValidation: ValidationNel[Throwable, T] = t match {
    case ut.Success(v) => v.success
    case ut.Failure(ex) => ex.failureNel
  }
}
And your model:
case class Employee(empId: String,
                    designation: String,
                    age: Int,
                    salary: Long,
                    department: Int)
case class Group(designation: String, department: Int)
We could easily write:
val file = getClass.getResource("employees.csv").getFile
val employees: ValidationNel[Throwable, Seq[Employee]] =
  CSVReader[Employee]
    .readCSVFromFileName(file)
    .traverseU(_.toValidation)
val averageSalary = (_: Seq[Employee])
  .groupBy(emp => corresponding[Group].move(emp))
  .mapValues {_
    .map(emp => BigDecimal(emp.salary))
    .qmean
  }
println(employees map averageSalary)
And get your grouped output.
Upvotes: 0
Reputation: 15141
Just groupBy a tuple:
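Source.fromFile("someFile.txt").
  getLines().
  map(_.split(",")).
  toSeq.
  // build an Employee from the five comma-separated fields
  map(data => Employee(data(0), data(1), data(2).toInt, data(3).toLong, data(4).toInt)).
  // key each record by (designation, department)
  groupBy(emp => (emp.designation, emp.department)).
  // average salary per group (Long division, so the result is truncated)
  mapValues(emps => emps.map(_.salary).sum / emps.length)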
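For anyone who wants to try this without a file on disk, here is a minimal self-contained sketch of the same idea. The names `GroupByExample`, `parse` and `averageByGroup` are made up for illustration, and it assumes every line has exactly five well-formed fields in the order of the `Employee` case class. It uses `map` over the grouped entries instead of `mapValues`, which I believe avoids the deprecation warning on newer Scala versions.

```scala
object GroupByExample {
  case class Employee(empId: String,
                      designation: String,
                      age: Int,
                      salary: Long,
                      department: Int)

  // Hypothetical helper: parse one comma-separated line.
  // Assumes five well-formed fields; a malformed line will throw.
  def parse(line: String): Employee = {
    val f = line.split(",").map(_.trim)
    Employee(f(0), f(1), f(2).toInt, f(3).toLong, f(4).toInt)
  }

  // Average salary per (designation, department).
  // Uses Long division, so fractional parts are truncated.
  def averageByGroup(lines: Iterator[String]): Map[(String, Int), Long] =
    lines.map(parse).toList
      .groupBy(e => (e.designation, e.department))
      .map { case (key, emps) => key -> (emps.map(_.salary).sum / emps.length) }
}
```

With a real file you would pass `Source.fromFile("someFile.txt").getLines()` as the argument and close the source afterwards.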
Upvotes: 0
Reputation: 9820
You can group by a tuple:
val employees = List(
  Employee("id", "des", 30, 1000, 1),
  Employee("id", "des2", 35, 1500, 1),
  Employee("id", "des", 40, 2000, 1)
)
employees
  .groupBy(e => (e.designation, e.department))
  .mapValues(emps => emps.map(_.salary).sum / emps.length)
// Map((des,1) -> 1500, (des2,1) -> 1500)
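One thing to watch: since `salary` is a `Long`, the division above is integer division and drops any fractional part. If you need fractional averages, a possible variation is to convert the sum to `Double` first. This is only a sketch; `FractionalAverage` and `averageSalary` are illustrative names, not part of any library.

```scala
object FractionalAverage {
  case class Employee(empId: String,
                      designation: String,
                      age: Int,
                      salary: Long,
                      department: Int)

  // Average salary per (designation, department) as a Double,
  // so the fractional part of the mean is kept.
  def averageSalary(employees: Seq[Employee]): Map[(String, Int), Double] =
    employees
      .groupBy(e => (e.designation, e.department))
      .map { case (key, emps) =>
        key -> (emps.map(_.salary).sum.toDouble / emps.length)
      }
}
```

For two employees in the same group earning 1000 and 1501, this gives 1250.5, where the Long version would give 1250.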
Upvotes: 1