Reputation: 1622
I have to monitor a log file in which is written the history of utilization of an app. This log file is formatted in this way:
<AppId,date,cpuUsage,memoryUsage>
<AppId,date,cpuUsage,memoryUsage>
<AppId,date,cpuUsage,memoryUsage>
<AppId,date,cpuUsage,memoryUsage>
<AppId,date,cpuUsage,memoryUsage>
... about 800000 rows
AppId
is always the same, because is referenced at only one app, date
is expressed in this format dd/mm/yyyy hh/mm
cpuUsage
and memoryUsage
are expressed in %
so for example:
<3ghffh3t482age20304,230720142245,0.2,3,5>
To be specific, I have to check the percentage of CPU usage and memory usage by this application to be monitored using spark and the map reduce algorithm.
My output is to print alert when the cpu or the memory are 100% of usage.
How can I start?
Upvotes: 0
Views: 704
Reputation: 5173
The idea is to declare a class and map the line into a scala object,
Lets declare the case class as follows,
case class App(name: String, date: String, cpuUsage: Double, memoryusage: Double)
Then initialize the SparkContext and create a RDD from the text file where the data is present,
val sc = new SparkContext(sparkConf)
val inFile = sc.textFile("log.txt")
then parse each line and map it to App object so that the range checking would be faster,
val mappedLines = inFile.map(x => (x.split(",")(0), parse(x)))
where the parse(x) method is defined as follows,
def parse(x: String):App = {
val splitArr = x.split(",");
val app = new App(splitArr(0),
splitArr(1),
splitArr(2).toDouble,
splitArr(3).toDouble)
return app
}
Note that i have assumed the input as follows, (this is just to give you the idea and not the entire program),
ffh3t482age20304,230720142245,0.2,100.5
Then do the filter transformation where you can perform the check and report the anamoly conditions,
val anamolyLines = mappedLines.filter(doCheckCPUAndMemoryUtilization)
anamolyLines.count()
where doCheckCPUAndMemoryUtilization function is defined as follows,
def doCheckCPUAndMemoryUtilization(x:(String, App)):Boolean = {
if(x._2.cpuUsage >= 100.0 ||
x._2.memoryusage >= 100.0) {
System.out.println("App name -> "+x._2.name +" exceed the limit")
return true
}
return false
}
Note: This is only a batch processing and not real-time processing.
Upvotes: 1