Reputation: 179
I have an Azure Event Hub Namespace based on a Standard tier with 17 TU's, which can also auto-inflate up to 40 TU's. It has 1 Event Hubs instance with 12 partitions.
This EH receives 2400 messages per second, which is 5.6 MB/second. 100% of this input is currently consumed by other clients that I don't know. I don't know how the partitioning is done, nor can I control it. I just know that the EH has 12 partitions, but I don't care, because I created another consumer group that streams a full copy of the incoming data to Azure Stream Analytics.
Concerning the message size, we have 13.8 TB / 6.08 billion ≈ 3 KB per message, a fairly complex nested JSON payload. My goal is to extract and write into an Azure Data Lake Storage Gen2 account only the records where "my_parameter" is present as a nested JSON field of the payload.
The records that match this rule are about 1 in 50, which means I should write about 50 messages per second, for a total of about 400 KB/second in Parquet format.
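For reference, the filter I have in mind is something like the sketch below (the input/output aliases and the nested "payload" record are placeholder names, since I can't share the real schema):

    -- Keep only the events whose payload contains "my_parameter".
    -- EventHubInput, DataLakeOutput and "payload" are placeholder names.
    SELECT *
    INTO DataLakeOutput
    FROM EventHubInput
    WHERE GetRecordPropertyValue(payload, 'my_parameter') IS NOT NULL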
As suggested by the documentation, we started by setting up 6 SUs. As you can see, ASA is unable to handle the incoming flow; this is clearly shown by:
- the watermark, which starts growing minute by minute and after 29 minutes has accumulated a 20-minute delay;
- the SU % utilization, which is constantly growing;
- the growing number of backlogged input events.
The number of incoming and outgoing events (the latter going into the Data Lake) is simply consistent with the numbers above.
To improve this situation, I followed this article to re-shuffle (repartition) the 12 EH partitions, which I can't control. I didn't change the 6 ASA SUs (= same cost), but I forced the repartition count to 2, 6 and 12 in separate tests with the technique from that article.
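Roughly, the repartitioning step looks like the sketch below (DeviceId stands in for whatever partition key is used, and 12 is one of the counts I tested, together with 2 and 6):

    -- Repartition the input before the pass-through; DeviceId is a placeholder key
    -- and 12 is one of the tested repartition counts (2, 6, 12).
    WITH RepartitionedInput AS
    (
        SELECT *
        FROM EventHubInput PARTITION BY DeviceId INTO 12
    )
    SELECT *
    INTO DataLakeOutput
    FROM RepartitionedInput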
It seems that the timing improves a bit (although I'm not sure a 1-hour test is enough); however, the main problem remains.
After many tests, I had to move up to 36 SUs to be able to support the workload and sometimes even catch up on the past messages still waiting in the Event Hub instance. This is the result using 36 SUs in a shared cluster, but I verified that I get the same result even using a dedicated cluster with 36 SUs.
In this case I started ASA after a 37-minute interruption; in fact, we see the watermark initially at 37 minutes and then slowly decreasing. This means that ASA is getting the oldest events first, and it's able to process them more quickly than the production speed.
Concerning the other parameters, we see that the CPU is close to 90% (which I shouldn't care about, right?), while the SU % utilization is quite small (about 10%) and stable, even though at the beginning the job is running faster than production because it has to recover the 37 minutes during which it wasn't working.
So, this looks like the solution to my problem. But I'm wondering why I should pay for 36 units (theoretically able to process 36 MB/second) when I need to process just 6.7 MB/second. It's 6 times more expensive this way!
My question: is this the right approach to identify the right sizing? How can I justify such a high price for something that should cost one sixth as much?
Upvotes: 1
Views: 577
Reputation: 842
We took the conversation offline as it was too specific, but here is some general guidance to approach that question.
You have a (mostly) pass-through query, from EH to Blob/Parquet, so you should be in embarrassingly parallel mode. You can check that by making sure you can scale that job up to 72 SUs (12 partitions * 6 SUs max per partition).
If not, you may have a misalignment between your input and output configurations, specifically the partition key in the input and the path pattern in the output. If that's the case, the job has to shuffle data and you will take a performance hit.
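As a rough sketch (reusing the placeholder aliases above), an aligned pass-through in embarrassingly parallel mode looks like this; on compatibility level 1.2 the PARTITION BY PartitionId clause is applied implicitly, while older levels need it spelled out:

    -- Pure pass-through aligned on the native Event Hub partitions: no shuffle,
    -- each of the 12 input partitions maps straight to one output writer.
    -- On compatibility level 1.2 the PARTITION BY clause below is implicit.
    SELECT *
    INTO DataLakeOutput
    FROM EventHubInput PARTITION BY PartitionId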
Your metrics make sense though. You have high CPU util but low SU (aka memory) usage. It looks like the job is indeed spending time unnesting the payload.
To identify what the costly component is here:
Unless we find a reason in the above, to be honest this is one of the use cases where ASA may not be the best tool for the job. We are using resources to deserialize/serialize records, apply dynamic typing, and optimize for heavy-duty processing/analytics… when you only need to write the records to disk. This can translate into wasted cycles and a cost/performance ratio that is not the best for that specific application.
Upvotes: 0