Severin Simko

Reputation: 83

How to get the total number of records processed by Spark Streaming?

Does anyone know how Spark computes its number of records (I think it is the same as the number of events in a batch), as displayed here?

enter image description here

I'm trying to figure out how to get this value remotely (there is no REST API for the Streaming tab in the UI).

Basically, what I'm trying to do is get the total number of records processed by my application. I need this information for a web portal.

I tried to sum the records for each stage, but that gave me a completely different number from the one in the picture above. Each stage contains information about its records, as shown here:

enter image description here

I'm using this short Python script to sum the "inputRecords" from each stage. This is the source code:

import json
from urllib.request import urlopen

print("Get stages script started!")
# REST API endpoint that lists all stages of the application
url = 'http://10.16.31.211:4040/api/v1/applications/app-20161104125052-0052/stages/'
response = urlopen(url)
data = json.loads(response.read().decode('utf-8'))

stages = []
print(len(data))  # number of stages returned
inputCounter = 0
# Sum the input records reported by every stage
for item in data:
    stages.append(item["stageId"])
    inputCounter += item["inputRecords"]
print("Records processed: " + str(inputCounter))

If I understand correctly, each batch has one job, each job has multiple stages, and each stage has multiple tasks.

So it made sense to me to count the input for each stage.

Upvotes: 2

Views: 4872

Answers (1)

maasg

Reputation: 37435

Spark offers a metrics endpoint on the driver:

<driver-host>:<ui-port>/metrics/json

A Spark Streaming application reports all the metrics available in the UI, and some more. The ones you are probably looking for are:

<driver-id>.driver.<job-id>.StreamingMetrics.streaming.totalProcessedRecords: {
value: 48574640
},
<driver-id>.driver.<job-id>.StreamingMetrics.streaming.totalReceivedRecords: {
value: 48574640
}

This endpoint can be customized. See the Spark Metrics documentation for details.
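As a rough sketch of how this could be read remotely from Python: the snippet below assumes the endpoint returns the usual Dropwizard-style JSON, with gauges under a top-level "gauges" key, which is worth checking against your own driver's output. The function names and the sample payload (including the "my-app-id" and "my-job" placeholders) are made up for illustration; only the metric suffix and the value come from the output shown above.

```python
import json
from urllib.request import urlopen

# Suffix of the gauge shown in the answer; the prefix depends on the application.
SUFFIX = ".StreamingMetrics.streaming.totalProcessedRecords"


def total_processed_records(metrics):
    """Return the totalProcessedRecords gauge value, or None if it is absent.

    Assumes a layout like {"gauges": {"<name>": {"value": <number>}}}.
    """
    for name, gauge in metrics.get("gauges", {}).items():
        if name.endswith(SUFFIX):
            return gauge["value"]
    return None


def fetch_metrics(driver_url):
    """Download and parse <driver_url>/metrics/json (hypothetical helper)."""
    with urlopen(driver_url.rstrip("/") + "/metrics/json") as response:
        return json.loads(response.read().decode("utf-8"))


# Offline example with a hand-written payload; the key prefix is a placeholder.
sample = {
    "gauges": {
        "my-app-id.driver.my-job.StreamingMetrics.streaming.totalProcessedRecords": {
            "value": 48574640
        },
        "my-app-id.driver.my-job.StreamingMetrics.streaming.totalReceivedRecords": {
            "value": 48574640
        },
    }
}
print(total_processed_records(sample))
```

Against a live driver, something like total_processed_records(fetch_metrics("http://<driver-host>:<ui-port>")) should return the same number, provided the JSON layout matches the assumption above.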

Upvotes: 5
