LeLuc

Reputation: 415

Scala - Get column with condition and groupBy?

I have the following class:

case class worker(
  age: Int,
  workclass: String,
  education: String,
  educationNum: Int,
  maritalStatus: String,
  occupation: String,
  relationship: String,
  race: String,
  sex: String,
  capitalGain: Int,
  capitalLoss: Int,
  hoursPerWeek: Int,
  nativeCountry: String,
  income: String
)

I want to write a function that returns the workclass that has the highest number of observations with an income > 50000.

I'm new to Scala, so I'm struggling, but I've tried this:

 def bestPayWork(c: Seq[worker]): String = {
    var highSalaryGrouped = c.filter(i => i.income > 50000).groupBy(i => i.workclass)
    var result = highSalaryGrouped.max("income", highSalaryGrouped)
    result
  }

Upvotes: 0

Views: 120

Answers (1)

michaJlS

Reputation: 2500

Here is a sample solution, assuming that by "observations" you mean the number of workers. I changed the return type to Option[String] to cover the case where c is empty. You also need to decide how to handle income, which is a String in your class but is being compared to a number. I simply converted it with toInt, which throws a NumberFormatException if a value is not a valid integer, so it may not work in your case. Note that maxByOption requires Scala 2.13 or later.

def bestPayWork(c: Seq[worker]): Option[String] = {
  val highSalaryGrouped = c.filter(i => i.income.toInt > 50000).groupBy(i => i.workclass).map {
    case (workclass, workers) => workclass -> workers.size
  }
  highSalaryGrouped.maxByOption { case (workclass, size) => size }.map(_._1)
}
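
If some income values might not be valid integers, a more defensive variant could skip those rows instead of throwing. The following is only a sketch under assumptions: it targets Scala 2.13+ (for toIntOption, groupMapReduce and maxByOption), and the BestPay object, the trimmed-down Worker class, the bestPayWorkSafe name and the threshold parameter are all made up for illustration rather than taken from the question.

```scala
object BestPay {
  // Hypothetical, trimmed-down stand-in for the question's `worker` class:
  // only the two fields this function needs.
  final case class Worker(workclass: String, income: String)

  // Returns the workclass with the most workers whose income parses to an
  // integer above `threshold`, or None if no worker qualifies.
  // Rows whose income is not a valid integer are skipped (an assumption;
  // you may prefer to fail loudly instead).
  def bestPayWorkSafe(c: Seq[Worker], threshold: Int = 50000): Option[String] =
    c.filter(w => w.income.trim.toIntOption.exists(_ > threshold))
      // Count qualifying workers per workclass in a single pass.
      .groupMapReduce(_.workclass)(_ => 1)(_ + _)
      // maxByOption yields None on an empty map instead of throwing.
      .maxByOption { case (_, count) => count }
      .map { case (workclass, _) => workclass }
}
```

If two workclasses tie for the highest count, which one is returned depends on the map's iteration order, so add an explicit tie-break if that matters for your data.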

Upvotes: 3
