Reputation: 1244
I'm setting up an Apache Spark cluster to perform real-time streaming computations, and I would like to monitor the performance of the deployment by tracking various metrics such as batch sizes, batch processing times, etc. My Spark Streaming program is written in Scala.
Questions
Thanks in advance,
Upvotes: 2
Views: 1732
Reputation: 2345
If you have no luck with 1., this will help with 2.:
ssc.addStreamingListener(new JobListener());

// ...

class JobListener implements StreamingListener {

    @Override
    public void onBatchCompleted(StreamingListenerBatchCompleted batchCompleted) {
        System.out.println("Batch completed, total delay: "
            + batchCompleted.batchInfo().totalDelay().get().toString() + " ms");
    }

    /*
     snipped other methods
    */
}
Taken from In Spark Streaming, is there a way to detect when a batch has finished?
batchCompleted.batchInfo()
contains:

- numRecords
- batchTime
- processingStartTime
- processingEndTime
- schedulingDelay
- outputOperationInfos
Hopefully you can get what you need from those properties.
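Since your program is in Scala, here is a rough Scala equivalent of the Java snippet above. `StreamingListener` is a trait with no-op default implementations, so you only need to override the callbacks you care about. The listener class name `BatchStatsListener` is my own; the fields read from `batchInfo` are the ones listed above.

```scala
import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

// A minimal sketch: log a few batch metrics after each completed batch.
class BatchStatsListener extends StreamingListener {
  override def onBatchCompleted(batchCompleted: StreamingListenerBatchCompleted): Unit = {
    val info = batchCompleted.batchInfo
    // totalDelay is an Option[Long] (milliseconds); it may be empty,
    // so fall back to -1 rather than calling .get blindly.
    println(s"Batch ${info.batchTime}: ${info.numRecords} records, " +
      s"total delay: ${info.totalDelay.getOrElse(-1L)} ms")
  }
}

// Register it on the StreamingContext, e.g.:
// ssc.addStreamingListener(new BatchStatsListener())
```

From there you can push the same numbers to whatever monitoring backend you use instead of printing them.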
Upvotes: 4