Pankaj Rawat

Reputation: 4573

How to get Azure EventHub Depth

My Event Hub ingests millions of messages every day. I'm processing those messages in an Azure Function and printing the offset and sequence number values in the logs.

public static async Task Run(
    [EventHubTrigger("%EventHub%", Connection = "EventHubConnection", ConsumerGroup = "%EventHubConsumerGroup%")] EventData eventMessage,
    [Inject] ITsfService tsfService,
    [Inject] ILog log)
{
    log.Info($"PartitionKey {eventMessage.PartitionKey}, Offset {eventMessage.Offset} and SequenceNumber {eventMessage.SequenceNumber}");
}

Log output

PartitionKey , Offset 78048157161248 and SequenceNumber 442995283

Questions

  1. Why is the PartitionKey value blank? I have 2 partitions in that Event Hub.

  2. Is there any way to check the backlog? At any given point in time, I want to know how many messages my function still needs to process.

Upvotes: 4

Views: 890

Answers (2)

Connie Yau

Reputation: 715

PartitionKey value blank? I have 2 partitions in that EventHub

The partition key is not the same as the partition ID. When you publish an event to Event Hubs, you can set a partition key. If no partition key is set at publish time, it will be null when you consume the event.

Partition key is for events where you don't care what partition it ends up in, just that you want events with the same key to end up in the same partition.

An example would be if you had hundreds of IoT devices transmitting telemetry data. You don't care what partition these IoT devices publish their data to, as long as it always ends up in the same partition. You may set the partition key to the serial number of the IoT device. When that device publishes its event data with that key, the Event Hubs service will calculate a hash for that partition key, map it to a specific Event Hub partition, and will route any events with that key to the same partition.
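
To make the key-to-partition idea concrete, here is a minimal sketch of a stable hash mapping. It is an illustration only: the real hash used by the Event Hubs service is internal to the service and is not the one shown here, and the names PartitionKeyDemo and MapToPartition are hypothetical.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class PartitionKeyDemo
{
    // Illustrative only: this is NOT the hash Event Hubs uses internally.
    // It shows the general idea: the same key always maps to the same partition.
    public static int MapToPartition(string partitionKey, int partitionCount)
    {
        if (partitionKey == null) throw new ArgumentNullException(nameof(partitionKey));
        if (partitionCount <= 0) throw new ArgumentOutOfRangeException(nameof(partitionCount));

        using (var sha = SHA256.Create())
        {
            // Hash the key, then reduce the first 4 bytes to a partition index.
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(partitionKey));
            uint bucket = BitConverter.ToUInt32(digest, 0);
            return (int)(bucket % (uint)partitionCount);
        }
    }
}
```

Calling MapToPartition with the same device serial number and partition count always returns the same index, which is the property the partition key gives you. Different keys may still share a partition.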

The documentation from "Event Hubs Features: Publishing an Event" depicts it pretty well.


Upvotes: 0

Dylan Morley

Reputation: 1726

Yes, you can include the PartitionContext object as part of the function signature, which gives you some additional information:

public static async Task Run([EventHubTrigger("HubName", 
    Connection = "EventHubConnectionStringSettingName", 
    ConsumerGroup = "Consumer-Group-If-Applicable")] EventData[] messageBatch, PartitionContext partitionContext, ILogger log)

Edit your host.json and set enableReceiverRuntimeMetric to true, for example:

{
    "version": "2.0",
    "extensions": {
        "eventHubs": {
            "batchCheckpointFrequency": 100,
            "eventProcessorOptions": {
                "maxBatchSize": 256,
                "prefetchCount": 512,
                "enableReceiverRuntimeMetric": true
            }
        }
    }
}

You now get access to RuntimeInformation on the PartitionContext, which includes the LastSequenceNumber for the partition. Each message has its own sequence number, so you can use the difference between the two as a per-partition backlog metric, for example:

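using Microsoft.ApplicationInsights;
using Microsoft.Azure.EventHubs;
using Microsoft.Azure.EventHubs.Processor;

public class EventStreamBacklogTracing
{
    // InsightsClient is a helper that is not shown here; it is assumed to expose
    // an Application Insights TelemetryClient as a singleton.
    // The metric has three dimensions: partition, consumer group and hub.
    private static readonly Metric PartitionSequenceMetric =
        InsightsClient.Instance.GetMetric("PartitionSequenceDifference", "PartitionId", "ConsumerGroupName", "EventHubPath");

    public static void LogSequenceDifference(EventData message, PartitionContext context)
    {
        // Sequence number of the event currently being processed.
        var messageSequence = message.SystemProperties.SequenceNumber;
        // Sequence number of the last event enqueued on this partition
        // (requires enableReceiverRuntimeMetric to be true).
        var lastEnqueuedSequence = context.RuntimeInformation.LastSequenceNumber;

        // How far this consumer is behind the end of the partition.
        var sequenceDifference = lastEnqueuedSequence - messageSequence;

        PartitionSequenceMetric.TrackValue(sequenceDifference, context.PartitionId, context.ConsumerGroupName,
            context.EventHubPath);
    }
}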
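
The difference above is per partition, so a rough total backlog is the sum of the latest difference seen on each partition. The following is a minimal sketch of that aggregation, not part of any Azure SDK; BacklogEstimator and its members are hypothetical names. It assumes you call Record with the two sequence numbers from the snippet above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

// Hypothetical helper: keeps the most recent lag per partition and sums them.
public class BacklogEstimator
{
    private readonly ConcurrentDictionary<string, long> lagByPartition =
        new ConcurrentDictionary<string, long>();

    // Store the latest observed lag for one partition, overwriting older values.
    public void Record(string partitionId, long messageSequence, long lastEnqueuedSequence)
    {
        // Clamp at zero in case the runtime information is slightly stale.
        long lag = Math.Max(0L, lastEnqueuedSequence - messageSequence);
        lagByPartition[partitionId] = lag;
    }

    // Approximate number of events still waiting across all observed partitions.
    public long TotalBacklog => lagByPartition.Values.Sum();
}
```

Treat the result as an estimate: it only covers partitions this process has actually received events from, and the runtime information is refreshed only when events arrive. If the function is scaled out, each instance sees only its own partitions.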

I wrote an article on Medium that goes into a bit more detail and shows how you might consume the data in Grafana:

https://medium.com/@dylanm_asos/azure-functions-event-hub-processing-8a3f39d2cd0f

Upvotes: 3
