abhishek

Reputation: 373

Pipeline processing with Azure Event Hubs and Azure Functions

I am new to Azure Event Hubs and trying to see whether there is an option for implementing pipeline processing (a chain of components where the output of the first goes to the next in the chain) with Event Hubs/Azure Functions.

I have a stream of data coming over an event hub and a set of components, where each component performs a specific function and passes its result to the next component. The next component might need both the original event data and the result generated by the previous component (so there can be a sequencing dependency). Are Event Hubs/Azure Functions suitable for such scenarios? What I understood is that each consumer gets its own copy of an event, so consumers can't be pipelined.

What is the best way to design a pipeline of components (so that new components can be added to the pipeline when needed) over a large stream of events in Azure? Or is the only option with Azure Event Hubs to have a single consumer build the complete processing pipeline of components (which will be somewhat coupled)? I don't want to use a separate event hub between each pair of components (completely decoupled, but too many event hubs to manage).
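For concreteness, a minimal sketch of the kind of chained processing I mean (plain Python, no Azure SDK; all component names are hypothetical), where each stage sees both the original event and the results of earlier stages:

    # Minimal sketch of a chained component pipeline (all names hypothetical).
    # Each stage receives the original event plus the results accumulated so
    # far, so later stages can depend on both.
    from typing import Any, Callable, Dict, List, Tuple

    Stage = Callable[[Dict[str, Any], Dict[str, Any]], Any]

    def enrich(event: Dict[str, Any], results: Dict[str, Any]) -> Any:
        # First component: derive something from the raw event.
        return {"length": len(event["body"])}

    def score(event: Dict[str, Any], results: Dict[str, Any]) -> Any:
        # Second component: needs the original event AND the previous result.
        return results["enrich"]["length"] * event.get("weight", 1)

    # Adding a new component means appending to this list, nothing else.
    PIPELINE: List[Tuple[str, Stage]] = [("enrich", enrich), ("score", score)]

    def run_pipeline(event: Dict[str, Any]) -> Dict[str, Any]:
        results: Dict[str, Any] = {}
        for name, stage in PIPELINE:
            results[name] = stage(event, results)
        return results

    print(run_pipeline({"body": "hello", "weight": 2}))
    # -> {'enrich': {'length': 5}, 'score': 10}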

Upvotes: 1

Views: 1117

Answers (1)

Aravind

Reputation: 4163

Event Hubs form the data-ingestion end of a data/event pipeline. From there, data is processed, analyzed in real time, and stored, and possibly categorized; historical data is analyzed later. This forms a typical pipeline. You are looking to apply a series of different types of processing, using different components/rule engines etc., to every message ingested into the event hub.

Options I can think of are:

1) EventProcessorHost - here you can write your own custom code (components running one after another) to receive messages from all or specific partitions of the event hub in an asynchronous fashion. This involves wiring up all your logic in code, so introducing a new component means a code change and a new deployment.
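As a rough illustration of this option (EventProcessorHost itself is .NET; this sketch uses the Python azure-eventhub SDK's EventHubConsumerClient, which plays the same role - the connection string, hub name, and in-process component chain are placeholders):

    # Sketch: a single consumer receives events and runs the whole component
    # chain in-process. Connection string and hub name are placeholders.
    from azure.eventhub import EventHubConsumerClient

    client = EventHubConsumerClient.from_connection_string(
        conn_str="<EVENT_HUBS_CONNECTION_STRING>",
        consumer_group="$Default",
        eventhub_name="<EVENT_HUB_NAME>",
    )

    def on_event(partition_context, event):
        payload = event.body_as_str()
        # Run the component chain here (e.g. the run_pipeline sketch in the
        # question); adding a component is still a code change + redeploy.
        results = {"length": len(payload)}
        print(partition_context.partition_id, results)
        # A no-op without a checkpoint store, but this is where progress is saved.
        partition_context.update_checkpoint(event)

    with client:
        # Blocks and dispatches events from all partitions to on_event.
        client.receive(on_event=on_event, starting_position="-1")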

2) Based on your point, you can also place intermediate event hubs between components and do the processing that way, but that may get expensive. You may want to look at Service Bus queues as the intermediate points for loading messages for subsequent processing; that will be a cheaper option.
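A rough sketch of that handoff (Python azure-servicebus SDK; the connection string and queue name are placeholders, and the message wraps the original event together with the previous component's result so the next component has both):

    # Sketch: component N hands the original event plus its result to
    # component N+1 through a Service Bus queue. Names are placeholders.
    import json
    from azure.servicebus import ServiceBusClient, ServiceBusMessage

    CONN_STR = "<SERVICE_BUS_CONNECTION_STRING>"

    def hand_off(event_body: str, result: dict) -> None:
        # Wrap original event + this component's result in one message.
        msg = ServiceBusMessage(json.dumps({"event": event_body, "result": result}))
        with ServiceBusClient.from_connection_string(CONN_STR) as client:
            with client.get_queue_sender(queue_name="component-2-input") as sender:
                sender.send_messages(msg)

    def next_component() -> None:
        # Component N+1 drains its queue and continues the chain.
        with ServiceBusClient.from_connection_string(CONN_STR) as client:
            with client.get_queue_receiver(queue_name="component-2-input") as receiver:
                for msg in receiver:
                    data = json.loads(str(msg))
                    print(data["event"], data["result"])
                    receiver.complete_message(msg)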

3) Logic Apps give you a workflow-orchestration model. In this case you can use events from Event Hubs as a trigger to kick off the workflow, but I am not sure this will suit your requirement completely, and there may also be scaling and performance concerns if the amount of data ingested is high.

Upvotes: 1
