jrichview

Reputation: 390

totalEstimatedMatches behavior with Microsoft (Bing) Cognitive search API (v5)

Recently converted some Bing Search API v2 code to v5 and it works but I am curious about the behavior of "totalEstimatedMatches". Here's an example to illustrate my question:

A user on our site searches for a particular word. The API query returns 10 results (our page size setting) and totalEstimatedMatches set to 21. We therefore indicate 3 pages of results and let the user page through.

When they get to page 3, totalEstimatedMatches returns 22 rather than 21. Seems odd that with such a small result set it shouldn't already know it's 22, but okay I can live with that. All results are displayed correctly.

Now if the user pages back again from page 3 to page 2, the value of totalEstimatedMatches is 21 again. This strikes me as a little surprising because once the result set has been paged through, the API probably ought to know that there are 22 and not 21 results.

I've been a professional software developer since the 80s, so I get that this is one of those devil-in-the-details issues of API design. Apparently it doesn't cache the exact number of results, or whatever. I just don't remember that kind of behavior in the v2 search API (which I realize was 3rd-party code); it was pretty reliable on the number of results.

Does this strike anyone besides me as a little bit unexpected?
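For reference, here's a sketch of the paging math described above. The `page_count` helper is hypothetical (not part of the API); it just shows how a UI might derive the number of pages from `totalEstimatedMatches` and our page size of 10:

```python
import math

def page_count(total_estimated_matches, page_size=10):
    # Number of result pages the UI offers, derived from the estimate
    # returned by the API.
    return math.ceil(total_estimated_matches / page_size)

# With totalEstimatedMatches == 21 the UI shows 3 pages...
print(page_count(21))  # 3
# ...and even if a later request reports 22, the page count is unchanged.
print(page_count(22))  # 3
```

So in this particular case the wobble between 21 and 22 happens to be harmless, but only by luck of the page-size arithmetic.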

Upvotes: 1

Views: 844

Answers (2)

Rob Truxal

Reputation: 6438

Revisiting the API, I've come up with a way to paginate efficiently without having to use the "totalEstimatedMatches" return value:

class ApiWorker(object):
    def __init__(self, q):
        self.q = q
        self.offset = 0
        self.result_hashes = set()
        self.finished = False

    def calc_next_offset(self, resp_urls):
        before_adding = len(self.result_hashes)
        self.result_hashes.update(hash(i) for i in resp_urls)  # <== abuse of set operations.
        after_adding = len(self.result_hashes)
        if after_adding == before_adding:  # <== then we either got a bunch of duplicates or we're getting very few results back.
            self.finished = True
        else:
            self.offset += len(resp_urls)

    def page_through_results(self, *args, **kwargs):
        while not self.finished:
            new_resp_urls = ...<call_logic>...
            self.calc_next_offset(new_resp_urls)
            ...<save logic>...
        print(f'All unique results for q={self.q} have been obtained.')

This^ will stop paginating as soon as a full response of duplicates has been obtained.
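To see the stopping condition in action, here's a self-contained run with a stubbed API call in place of the real Bing request (the `fake_api_call` function and its page data are made up for illustration):

```python
class ApiWorker(object):
    def __init__(self, q):
        self.q = q
        self.offset = 0
        self.result_hashes = set()
        self.finished = False

    def calc_next_offset(self, resp_urls):
        before_adding = len(self.result_hashes)
        self.result_hashes.update(hash(i) for i in resp_urls)
        after_adding = len(self.result_hashes)
        if after_adding == before_adding:
            self.finished = True  # a full page of duplicates: stop
        else:
            self.offset += len(resp_urls)

# Fabricated pages; the last page is all duplicates.
PAGES = [
    ['http://a', 'http://b'],  # page 1: two new results
    ['http://c', 'http://b'],  # page 2: one new, one duplicate
    ['http://a', 'http://c'],  # page 3: all duplicates -> stop
]

def fake_api_call(offset):
    # Stand-in for the real HTTP request; returns one page per offset.
    return PAGES[min(offset // 2, len(PAGES) - 1)]

worker = ApiWorker('example')
while not worker.finished:
    worker.calc_next_offset(fake_api_call(worker.offset))

print(len(worker.result_hashes))  # 3 unique results collected
```

Note that the offset still advances by the full page size even when a page is only partially new; only a page of *all* duplicates halts the loop.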

Upvotes: 0

Rob Truxal

Reputation: 6438

It turns out there's a reason the response JSON field totalEstimatedMatches includes the word ...Estimated... and isn't just called totalMatches:

"...search engine index does not support an accurate estimation of total match."

Taken from: News Search API V5 paging results with offset and count

As one might expect, the fewer results you get back, the larger the % error you're likely to see in the totalEstimatedMatches value. Similarly, the more complex your query (for example, a compound query such as ../search?q=(foo OR bar OR foobar)&... which is really 3 searches packed into 1), the more this value seems to vary.

That said, I've managed to (at least preliminarily) compensate for this by setting the offset to totalEstimatedMatches and creating a simple equivalency-checking function.

Here's a trivial example in python:

# Assumes both variables were initialized from earlier API calls.
while True:
    if original_totalEstimatedMatches < new_totalEstimatedMatches:
        original_totalEstimatedMatches = new_totalEstimatedMatches

        # set_new_offset_and_call_api() is a func that does what it says.
        new_totalEstimatedMatches = set_new_offset_and_call_api()
    else:
        break
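For a self-contained run of that loop, here's the same logic with a stubbed set_new_offset_and_call_api (the sequence of estimates is fabricated; a real implementation would issue an HTTP request with the offset set to the current estimate and return the totalEstimatedMatches field of the response):

```python
# Fabricated totalEstimatedMatches values returned by successive calls.
ESTIMATES = iter([21, 22, 22])

def set_new_offset_and_call_api():
    # Stand-in for: set offset to the current estimate, call the API,
    # and return totalEstimatedMatches from the response.
    return next(ESTIMATES)

original_totalEstimatedMatches = set_new_offset_and_call_api()
new_totalEstimatedMatches = set_new_offset_and_call_api()

while True:
    if original_totalEstimatedMatches < new_totalEstimatedMatches:
        original_totalEstimatedMatches = new_totalEstimatedMatches
        new_totalEstimatedMatches = set_new_offset_and_call_api()
    else:
        break

print(original_totalEstimatedMatches)  # settles at 22
```

The loop simply keeps probing until two consecutive calls agree, which is the "equivalency check" in plain form.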

Upvotes: 1
