Vasanth Subramanian

Reputation: 1361

NiFi: Calculate a processor data flow over a period of time (> 5 mins)

I am looking for data-flow statistics (bytesIn, bytesOut) for the most recent week. Using the NiFi REST API endpoint [GET] /nifi-api/processors/{id}, I got the statistics below, which cover only the last five minutes. Is there an existing API endpoint to retrieve data-flow statistics over a full week?

{
    "id": "1234aa-1234-1f23-1f23-123456ed51f1a",
    "status": {
        "name": "MyConsumeKafkaProcessor",
        "runStatus": "Running",
        "statsLastRefreshed": "14:39:55 EDT",
        "aggregateSnapshot": {
            "type": "ConsumeKafka_0_10",
            "runStatus": "Running",
            "executionNode": "ALL",
            "bytesRead": 0,
            "bytesWritten": 23948016,
            "read": "0 bytes",
            "written": "22.84 MB",
            "flowFilesIn": 0,
            "bytesIn": 0,
            "input": "0 (0 bytes)",
            "flowFilesOut": 2188,
            "bytesOut": 23948016,
            "output": "2,188 (22.84 MB)",
            "taskCount": 1179,
            "tasksDurationNanos": 15974094510,
            "tasks": "1,179",
            "tasksDuration": "00:00:15.974",
            "activeThreadCount": 0,
            "terminatedThreadCount": 0
        }
    }
}
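For reference, the fields of interest can be pulled out of that response like so. This is a minimal sketch that hard-codes an abridged copy of the JSON above; in practice the entity would come from an HTTP GET against the endpoint. Since each `aggregateSnapshot` only covers a rolling five-minute window, one workaround for longer ranges is to poll on a schedule and store or sum the samples yourself.

```python
import json

# Abridged copy of the processor-status JSON shown above.
response = json.loads("""
{
    "id": "1234aa-1234-1f23-1f23-123456ed51f1a",
    "status": {
        "name": "MyConsumeKafkaProcessor",
        "aggregateSnapshot": {
            "bytesIn": 0,
            "bytesOut": 23948016,
            "flowFilesOut": 2188
        }
    }
}
""")

def snapshot_stats(entity):
    """Extract the rolling five-minute byte counts from a processor entity."""
    snap = entity["status"]["aggregateSnapshot"]
    return snap["bytesIn"], snap["bytesOut"]

bytes_in, bytes_out = snapshot_stats(response)
print(bytes_in, bytes_out)  # 0 23948016
```

Polling roughly every five minutes and summing the non-overlapping samples would approximate a weekly total, at the cost of running your own collector.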

Upvotes: 0

Views: 752

Answers (1)

Up_One

Reputation: 5271

I had a similar issue, trying to see what comes in from Kafka over a period of time.

I used counters with a frequency of 1 minute.

So the flow looks like this:

1 - ConsumeKafka

2 - Capture the record count coming out of Kafka

3 - UpdateCounter IN (on a cloned success connection) - the delta will be the record count of the Kafka payload

4 - Do your stuff with the data (enrich/change, etc.)

5 - Capture the record count before persisting the data

6 - UpdateCounter OUT

7 - Persist the data (DB/S3/etc.)

I then have a flow that interrogates https://${hostname(true)}:8443/nifi-api/counters every 60 seconds.

I land this data in a monitoring DB repo.

I use this to measure data delivery IN/OUT of NiFi and look for dropouts, throughput, etc.

I do the same with my source data; in the Kafka case I capture the number of messages generated every minute.
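The polling step can be sketched as below. The payload shape (a `counters.aggregateSnapshot.counters` list of counter entries) matches what GET /nifi-api/counters returns in recent NiFi versions, but the field names are worth double-checking against your version; the values here are made up for illustration.

```python
import json

# Example payload in the shape returned by GET /nifi-api/counters
# (counter names and values are fabricated for illustration).
payload = json.loads("""
{
  "counters": {
    "aggregateSnapshot": {
      "counters": [
        {"context": "UpdateCounter", "name": "IN",  "valueCount": 2188},
        {"context": "UpdateCounter", "name": "OUT", "valueCount": 2188}
      ]
    }
  }
}
""")

def counter_values(entity):
    """Map counter name -> value from a counters-entity response."""
    counters = entity["counters"]["aggregateSnapshot"]["counters"]
    return {c["name"]: c["valueCount"] for c in counters}

values = counter_values(payload)
print(values["IN"], values["OUT"])  # 2188 2188
```

A real poller would fetch the endpoint in a loop (e.g. with `time.sleep(60)` between requests) and write each sample with a timestamp into the monitoring DB, so dropouts show up as flat or missing counter deltas.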

Upvotes: 1
