Alok Kumar Singh

Reputation: 2569

Which metric type to use for job duration in Prometheus: Gauge or Summary?

I have a task that does 5 things, and I want to measure the time taken by each of them using Prometheus. The task runs at a fixed interval of 30 minutes.

The job has the following labels: consumergroup, topic

Which metric type should I use to measure the job's total time as well as the time taken by each of the 5 things in it? I want this data so that I can figure out which consumergroup/topic needs optimization. Later I will also alert on these metrics.

The job is long-running: each task takes minutes to complete. A Summary, right? Will it give me all the details over time so that I can visualize and optimize it?

Upvotes: 0

Views: 1239

Answers (1)

Alok Kumar Singh

Reputation: 2569

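Using a Histogram helped me plot this. The following query divides the sum of observed durations by the number of observations, which gives the average duration per consumergroup/topic: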

rsk_loader_seconds_sum{consumergroup=~"$consumergroup", topic=~"$topic", sink_group="$sinkgroup"}/rsk_loader_seconds_count{consumergroup=~"$consumergroup", topic=~"$topic", sink_group="$sinkgroup"}
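To make clear what the `_sum / _count` query above computes, here is a minimal, standard-library-only sketch of the bookkeeping behind one histogram series. It is not the client_golang implementation; the type `stageHistogram`, its methods, the bucket bounds and the sample durations are all hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// stageHistogram is a hypothetical stand-in for one labelled histogram
// series. It only mimics the sum/count/bucket bookkeeping.
type stageHistogram struct {
	upperBounds []float64 // sorted bucket upper bounds ("le"), in seconds
	bucketCount []uint64  // per-bucket counts; the last slot plays the role of +Inf
	sum         float64   // what a series named <name>_sum would hold
	count       uint64    // what a series named <name>_count would hold
}

func newStageHistogram(upperBounds []float64) *stageHistogram {
	b := append([]float64(nil), upperBounds...)
	sort.Float64s(b)
	return &stageHistogram{upperBounds: b, bucketCount: make([]uint64, len(b)+1)}
}

// Observe records one duration in seconds.
func (h *stageHistogram) Observe(seconds float64) {
	// Index of the first bucket whose upper bound is >= the observation;
	// len(upperBounds) means the observation only fits the +Inf bucket.
	i := sort.SearchFloat64s(h.upperBounds, seconds)
	h.bucketCount[i]++
	h.sum += seconds
	h.count++
}

// Cumulative returns, for each bound, how many observations were <= it,
// which is how histogram buckets are exposed.
func (h *stageHistogram) Cumulative() []uint64 {
	out := make([]uint64, len(h.bucketCount))
	var running uint64
	for i, c := range h.bucketCount {
		running += c
		out[i] = running
	}
	return out
}

// Mean is the value the sum/count division yields.
func (h *stageHistogram) Mean() float64 {
	if h.count == 0 {
		return 0
	}
	return h.sum / float64(h.count)
}

func main() {
	// Assumed bucket bounds of 1, 2, 5 and 10 minutes.
	h := newStageHistogram([]float64{60, 120, 300, 600})
	for _, d := range []float64{45, 90, 130, 700} {
		h.Observe(d)
	}
	fmt.Println(h.Cumulative(), h.Mean()) // prints "[1 2 3 3 4] 241.25"
}
```

Note that dividing the raw `_sum` by the raw `_count` averages over every observation recorded since the process started, so a recent slowdown is diluted by older runs.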


    durationMetric = prometheus.NewHistogramVec(
        prometheus.HistogramOpts{
            Namespace: "rsk",
            Subsystem: "loader",
            Name:      "seconds",
            Help:      "total time taken to load data in Redshift in seconds",
            Buckets:   buckets,
        },
        []string{"consumergroup", "topic", "sink_group"},
    )
    copyStageMetric = prometheus.NewHistogramVec(
        prometheus.HistogramOpts{
            Namespace: "rsk",
            Subsystem: "loader",
            Name:      "copystage_seconds",
            Help:      "time taken to create staging table and load data in it in seconds",
            Buckets:   buckets,
        },
        []string{"consumergroup", "topic", "sink_group"},
    )
    deDupeMetric = prometheus.NewHistogramVec(
        prometheus.HistogramOpts{
            Namespace: "rsk",
            Subsystem: "loader",
            Name:      "dedupe_seconds",
            Help:      "time taken to deduplicate table in staging in seconds",
            Buckets:   buckets,
        },
        []string{"consumergroup", "topic", "sink_group"},
    )
    deleteCommonMetric = prometheus.NewHistogramVec(
        prometheus.HistogramOpts{
            Namespace: "rsk",
            Subsystem: "loader",
            Name:      "deletecommon_seconds",
            Help:      "time taken to delete common in seconds",
            Buckets:   buckets,
        },
        []string{"consumergroup", "topic", "sink_group"},
    )
    deleteOpStageMetric = prometheus.NewHistogramVec(
        prometheus.HistogramOpts{
            Namespace: "rsk",
            Subsystem: "loader",
            Name:      "deleteop_seconds",
            Help:      "time taken to delete rows with operations delete in seconds",
            Buckets:   buckets,
        },
        []string{"consumergroup", "topic", "sink_group"},
    )
    copyTargetMetric = prometheus.NewHistogramVec(
        prometheus.HistogramOpts{
            Namespace: "rsk",
            Subsystem: "loader",
            Name:      "copytarget_seconds",
            Help:      "time taken to copy to target table from staging table in seconds",
            Buckets:   buckets,
        },
        []string{"consumergroup", "topic", "sink_group"},
    )

Upvotes: 0
