Zekokhan

Reputation: 203

How to eliminate Azure Search Service API Throttling

Our goal is to completely eliminate Azure Search service throttling. We started with 3 replicas and 1 partition on the S1 tier, and at times up to 1.5% of requests were being throttled. We took several measures to alleviate the problem:

1) We load tested the service and established a baseline request rate for 3 replicas. Every time we hit ~37 req/sec, the service started throttling.

2) We did not want our users to see errors, so we implemented an exponential backoff transient fault policy that retries the call when the Azure Search API returns a 5xx or 408 (Request Timeout) response. That worked well for us.

3) The problem remained: we still get throttled at 37 req/sec, which seems very low to us. That works out to a maximum of ~12 req/sec per replica. So we performance tuned our queries (removed facets, removed high cardinality fields from our index, cleaned up our field properties, and made sure the index is doing the bare minimum). Our queries got a little faster, but this had little effect on throttling.

4) So we decided to go up to five replicas to get rid of throttling. We load tested again, and the service now handles a baseline of ~59 req/sec. That again works out to ~12 req/sec per replica.
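For reference, the retry policy described in step 2 can be sketched roughly as follows. This is a minimal illustration, not our production code: `call_with_backoff`, the `send` callable, and the delay values are hypothetical names and numbers chosen for the example, and the random jitter is an addition that the policy described above does not necessarily use.

```python
import random
import time

# Status codes treated as transient: 408 (Request Timeout) and any 5xx.
RETRYABLE_STATUS = {408} | set(range(500, 600))


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is a hypothetical zero-argument callable that performs one search
    request and returns a (status_code, body) tuple.
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUS:
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts; return the last failure to the caller
        # The delay ceiling doubles on each attempt, capped at max_delay.
        ceiling = min(max_delay, base_delay * 2 ** attempt)
        # Random jitter spreads retries out so clients do not retry in lockstep.
        sleep(random.uniform(0, ceiling))
    return status, body
```

Retries like this hide throttling from users but do not add capacity: every retried call is still another request against the same replicas.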

~12 req/sec per replica seems like low capacity for a Standard tier service. This is a huge problem for us, as our traffic is only going to increase (not to mention the nasty bot traffic we have to deal with).
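The per-replica arithmetic behind these numbers, and what it would imply for future traffic, can be sketched as below. The assumption that throughput keeps scaling linearly with replica count is ours; the two load tests are consistent with it but do not prove it. The function names, the target of 100 req/sec, and the 20% headroom are hypothetical values for illustration only.

```python
import math


def per_replica_rate(total_rps, replicas):
    """Average sustainable requests/sec per replica from one load test."""
    return total_rps / replicas


def replicas_needed(target_rps, rps_per_replica, headroom=0.2):
    """Replicas needed for target_rps, assuming linear scaling (an assumption).

    headroom is extra capacity reserved for spikes, as a fraction of target_rps.
    """
    return math.ceil(target_rps * (1 + headroom) / rps_per_replica)


# Load test results from above: replica count -> req/sec at which throttling began.
measurements = {3: 37, 5: 59}
rates = {n: per_replica_rate(rps, n) for n, rps in measurements.items()}

for n, rate in rates.items():
    print(f"{n} replicas: {rate:.1f} req/sec per replica")

# Hypothetical target of 100 req/sec, using the lower measured per-replica rate.
print(replicas_needed(100, min(rates.values())))
```

Both tests land close to 12 req/sec per replica, which is why adding replicas has only moved the throttling threshold proportionally.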

Do these benchmark numbers look right to the Azure Search Team?

Or are we doing something wrong? I can provide the search query if needed.

Any help would be much appreciated!

Thanks!

Upvotes: 2

Views: 950

Answers (2)

Zekokhan

Reputation: 203

Thank you for the detailed response, Matthew!

1) We have followed the strategies mentioned here: Deployment strategies and best practices for optimizing performance on Azure Search

2) We have thought about running the indexer during off-peak hours, but our use case requires us to run the indexer more frequently (it is set to run every 15 minutes).

3) Yes, our query might be a bit complex.

Index size: 160K rows; Number of fields: 108

Here is an example query from our landing page:

    "$count=false&facet=IsUsed,count:500&facet=Year,count:500&facet=ChassisMake,count:500&facet=ChassisModel,count:500&facet=NormalTrim,count:500&facet=CabType,count:500&facet=RoofHeight,count:500&facet=ChassisType,count:500&facet=DriveTrain,count:500&facet=RearWheels,count:500&facet=FuelType,count:500&facet=NormalEngine,count:500&facet=NormalTransmission,count:500&facet=NormalColor,count:500&facet=GVWR,count:500&facet=Wheelbase,count:500&facet=CA,count:500&facet=BodyType,count:500&facet=BodyMake,count:500&facet=HasSnowPlow,count:500&facet=HasCrane,count:500&facet=HasVanPartition,count:500&facet=BodyLength,count:500&facet=DealerNumericID,count:2000&$filter=((search.in(CMID, '5e3c3789-bb0f-4e6a-8c8b-a0fc31568d85') ) and ( HasLiftKit eq null )) and (IsDealerLive eq true) and  IsDemoDealer eq false  and  DepartureDate eq null and  IsUsed eq false  and geo.distance(GeoPoint, geography'POINT(-121.141636 38.666597)') le 80&queryType=simple&scoringParameter=IsUpfit-'true'&scoringParameter=GeoPoint-'-121.141636','38.666597'&scoringProfile=locator-distance&searchMode=any&$select=ID,DealerID,IsUsed,Featured,CustomTitle,StockNumber,CleanStockNumber,Vin,ChassisImagePathTemplate,ChassisBlobLastUpdated,BodyImagePathTemplate,BodyBlobLastUpdated,ChassisModelVINDecodingID,ChassisManufacturerID,BodyManufacturerID,BodyType_Code,ChassisMake,ChassisModel,DealerNumericID,Year,BodyTypeID,BodyType,EnabledAttributes,Mileage,CabType,DriveTrain,RearAxle,FuelType,Transmission,Color,RoofHeight,SalePrice,OnSale,SaleStartDate,SaleEndDate,SaleShowSaleBanner&$skip=0&$top=10
SearchString:*"

That query runs in 75 ms when the index is warmed up and in ~300 ms when the index is not warmed up.

Please let us know what you think.

Thanks a bunch!

Upvotes: 0

Matthew Gotteiner

Reputation: 405

Scale in Azure Search is a complex topic. I recommend the following:

  1. Review Deployment strategies and best practices for optimizing performance on Azure Search
  2. From Develop baseline numbers:

    Azure Search does not run indexing tasks in the background. If your service handles query and indexing workloads concurrently, take this into account by either introducing indexing jobs into your query tests, or by exploring options for running indexing jobs during off peak hours.

    If you are running indexing jobs concurrently with your queries, consider running them during off-peak hours or increasing your scale even further.

  3. You mention that you feel the request rate is too low at 12 requests per second. It looks like you've already taken some of the steps listed on the performance optimization page, including removing high cardinality fields from your index. I suspect you have slow individual queries. Would you consider increasing the number of partitions you have?

From Scaling for slow individual queries:

A partition is a mechanism for splitting your data across extra resources. Adding a second partition splits data into two, a third partition splits it into three, and so forth. One positive side-effect is that slower queries sometimes perform faster due to parallel computing. We have noted parallelization on low selectivity queries, such as queries that match many documents, or facets providing counts over a large number of documents. Since significant computation is required to score the relevancy of the documents, or to count the numbers of documents, adding extra partitions helps queries complete faster.

We can provide more detailed recommendations if you share your query, what an example record might look like, and how many records you have in your index.

Upvotes: 2
