Reputation: 137
I have started learning Scala and am trying to solve the scenario below. I have an input file with multiple transactions, with fields separated by ','. Below are my sample values:
transactionId, accountId, transactionDay, category, transactionAmount
A11,A45,1,SA,340
A12,A2,1,FD,567
I have to calculate the total transaction value for all transactions for each day, along with other statistics. Below is my initial snippet:
import scala.io.Source
val fileName = "<path of input file>"
// One record per line of the input file
case class Transaction(
  transactionId: String, accountId: String,
  transactionDay: Int, category: String,
  transactionAmount: Double)
// Read the file line by line, skipping the header row
val transactionslines = Source.fromFile(fileName).getLines().drop(1)
val transactions: List[Transaction] = transactionslines.map { line =>
  val split = line.split(',')
  Transaction(split(0), split(1), split(2).toInt, split(3), split(4).toDouble)
}.toList
Upvotes: 0
Views: 124
Reputation: 895
You can group the transactions by day and sum the amounts in each group:
val sd=transactions.groupBy(_.transactionDay).mapValues(_.map(_.transactionAmount).sum)
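Since the question also asks for "other statistics", the same groupBy can return more than a sum. Below is a minimal sketch, assuming the Transaction case class from the question. The names DailyStats, DayStats and statsByDay are hypothetical and used only for illustration, and the sample rows are the two from the question.

```scala
// Same shape as the case class in the question
case class Transaction(
  transactionId: String, accountId: String,
  transactionDay: Int, category: String,
  transactionAmount: Double)

object DailyStats {
  // Hypothetical container for per-day figures (not part of the original post)
  case class DayStats(total: Double, count: Int, average: Double)

  // Group by day, then derive total, count and average for each group
  def statsByDay(transactions: List[Transaction]): Map[Int, DayStats] =
    transactions.groupBy(_.transactionDay).map { case (day, txs) =>
      val amounts = txs.map(_.transactionAmount)
      val total = amounts.sum
      day -> DayStats(total, amounts.size, total / amounts.size)
    }

  def main(args: Array[String]): Unit = {
    // Sample rows taken from the question; a real run would parse the input file
    val transactions = List(
      Transaction("A11", "A45", 1, "SA", 340.0),
      Transaction("A12", "A2", 1, "FD", 567.0))

    statsByDay(transactions).toSeq.sortBy(_._1).foreach { case (day, s) =>
      println(s"day $day: total=${s.total}, count=${s.count}, average=${s.average}")
    }
  }
}
```

Building the result with map over the grouped pairs yields a strict Map, whereas mapValues returns a lazy view in Scala 2.12 and is deprecated on Map in 2.13.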
Further, you can do more complex analytics by converting the list into a Spark DataFrame:
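// toDF needs the implicits of an existing SparkSession (assumed here to be named spark)
import spark.implicits._
val scalatoDF = spark.sparkContext.parallelize(transactions).toDF("transactionId","accountId","transactionDay","category","transactionAmount")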
scalatoDF.show()
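Once the data is in a DataFrame, the per-day statistics can be expressed as a groupBy followed by agg. This is only a sketch, assuming Spark SQL is on the classpath and a local master is acceptable. The names DailyStatsDF and dailyStats and the output column names are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{avg, count, sum}

// Must be defined at top level so Spark can derive an encoder for it
case class Transaction(
  transactionId: String, accountId: String,
  transactionDay: Int, category: String,
  transactionAmount: Double)

object DailyStatsDF {
  // Hypothetical helper: total, number of transactions and average per day
  def dailyStats(df: DataFrame): DataFrame =
    df.groupBy("transactionDay")
      .agg(
        sum("transactionAmount").as("total"),
        count("transactionId").as("transactions"),
        avg("transactionAmount").as("average"))
      .orderBy("transactionDay")

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a cluster job would configure this differently
    val spark = SparkSession.builder()
      .appName("DailyStatsDF")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample rows taken from the question
    val transactions = Seq(
      Transaction("A11", "A45", 1, "SA", 340.0),
      Transaction("A12", "A2", 1, "FD", 567.0))

    dailyStats(transactions.toDF()).show()
    spark.stop()
  }
}
```

Calling toDF() without arguments keeps the case class field names as column names, so the column list does not have to be repeated.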
Hope this helps!
Upvotes: 2