Durga

Reputation: 137

Read a csv file using scala and generate analytics

I have started learning Scala and I have tried to solve the scenario below. I have an input file with multiple transactions, one per line, with fields separated by ','. Below are my sample values:

transactionId, accountId, transactionDay, category, transactionAmount

A11,A45,1,SA,340
A12,A2,1,FD,567

and I have to calculate the total transaction value for all transactions for each day, along with other statistics. Below is my initial snippet:

import scala.io.Source

case class Transaction(
  transactionId: String, accountId: String,
  transactionDay: Int, category: String,
  transactionAmount: Double)

val fileName = "<path of input file>"

// Skip the header row, then parse each line into a Transaction
val transactionLines = Source.fromFile(fileName).getLines().drop(1)
val transactions: List[Transaction] = transactionLines.map { line =>
  val split = line.split(',')
  Transaction(split(0), split(1), split(2).toInt, split(3), split(4).toDouble)
}.toList
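
For reference, the two sample rows should parse like this (a quick REPL sketch with the rows inlined, no file needed):

val sampleLines = List("A11,A45,1,SA,340", "A12,A2,1,FD,567")
val sample = sampleLines.map { line =>
  val s = line.split(',')
  Transaction(s(0), s(1), s(2).toInt, s(3), s(4).toDouble)
}
// sample: List(Transaction(A11,A45,1,SA,340.0), Transaction(A12,A2,1,FD,567.0))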

Upvotes: 0

Views: 124

Answers (1)

Subash

Reputation: 895

You can do it as below:

// Total transaction amount for each day, keyed by transactionDay
val sd = transactions
  .groupBy(_.transactionDay)
  .mapValues(_.map(_.transactionAmount).sum)
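
For the sample rows this gives Map(1 -> 907.0), since 340 + 567 = 907 on day 1. The other per-day statistics you mention follow the same pattern; here is a sketch computing sum, max, min, and average (assuming the Transaction case class from the question):

val statsPerDay = transactions.groupBy(_.transactionDay).mapValues { txs =>
  val amounts = txs.map(_.transactionAmount)
  (amounts.sum, amounts.max, amounts.min, amounts.sum / amounts.size)
}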

Further, you can do more complex analytics by converting the list into a Spark DataFrame.

// Assumes a SparkSession named `spark` is in scope (as in spark-shell)
import spark.implicits._  // enables the .toDF conversion on the RDD

val scalatoDF = spark.sparkContext
  .parallelize(transactions)
  .toDF("transactionId", "accountId", "transactionDay", "category", "transactionAmount")

scalatoDF.show()
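
The same daily total can then be expressed with the DataFrame API, for example:

import org.apache.spark.sql.functions.sum

scalatoDF.groupBy("transactionDay").agg(sum("transactionAmount")).show()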

Hope this helps!

Upvotes: 2
