djm.im

Reputation: 3323

Bing Search API v7 Pagination

I am working on a Bing News API v7 integration. More precisely, I use the https://api.cognitive.microsoft.com/bing/v7.0/news/search endpoint.

I found some unexpected paging behavior. (I expected every page to have the same size.)

The documentation page on how to page through results explains the recommended approach.

I follow that approach with a page size of 30, so the offset values are 0, 30, 60, and so on.

For example, I use these parameters: query "Java 14", market "en-US", sort by date, and offsets 0, 30, 60, 90, 120, and 150 (/bing/v7.0/news/search?q=Java 14&count=30&offset=0&mkt=en-US&sortBy=date).
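To make the six requests concrete, here is a minimal Python sketch that builds the URLs for the parameters above. The function name `build_url` and the use of `urllib.parse.urlencode` are my own choices for illustration; only the endpoint and the parameter names come from the request shown above.

```python
from urllib.parse import urlencode

# Endpoint from the question
ENDPOINT = "https://api.cognitive.microsoft.com/bing/v7.0/news/search"


def build_url(query, offset, count=30, market="en-US"):
    """Build one news search URL for a given offset (hypothetical helper)."""
    params = {
        "q": query,
        "count": count,
        "offset": offset,
        "mkt": market,
        "sortBy": "date",
    }
    # urlencode escapes the space in "Java 14" as "+"
    return ENDPOINT + "?" + urlencode(params)


# Fixed-stride offsets: 0, 30, 60, 90, 120, 150
urls = [build_url("Java 14", offset) for offset in range(0, 180, 30)]
```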

I get six pages of results, and each page contains fewer than 30 URLs:

Page: 0 Total: 27 results
Page: 1 Total: 26 results
Page: 2 Total: 26 results
Page: 3 Total: 29 results
Page: 4 Total: 29 results
Page: 5 Total: 7 results
...

The Stack Overflow question "What's the expected behavior of the Bing Search API v5 when deeply paginating?" covers Bing API v5. There, the paging values don't follow a fixed-size sequence; instead, the formula is previous result size + 1.

So, my question is: which offset should I use for the second page (Page: 1), 28 or 30? And which for the third page (Page: 2), 54 or 60?
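To spell out the candidates, this small Python sketch computes the offset sequences under three readings, using the per-page counts listed above. It only illustrates the arithmetic and does not show which scheme the API expects; the variable names are mine.

```python
from itertools import accumulate

PAGE_SIZE = 30
# Number of URLs actually returned on pages 0..5 in the example above
returned = [27, 26, 26, 29, 29, 7]

# Reading 1: fixed stride, offset = page index * page size
fixed = [page * PAGE_SIZE for page in range(len(returned))]

# Reading 2: offset = number of results received so far
cumulative = [0] + list(accumulate(returned))[:-1]

# Reading 3: the "previous result size + 1" variant mentioned in the question
plus_one = [0] + [offset + 1 for offset in cumulative[1:]]

print(fixed)       # [0, 30, 60, 90, 120, 150]
print(cumulative)  # [0, 27, 53, 79, 108, 137]
print(plus_one)    # [0, 28, 54, 80, 109, 138]
```

The values 28 and 54 in the question correspond to the third reading, and 30 and 60 to the first.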

Upvotes: 0

Views: 972

Answers (1)

Rob

Reputation: 1216

Make a first call to the API to read totalEstimatedMatches. Divide totalEstimatedMatches by the page size (25 here) to get the number of API calls to make. For example, if totalEstimatedMatches = 100, make 4 API calls, each of which should return 25 URLs. I play it safe and reduce that by 1, but you could instead wrap the calls in a Try/Catch. s.Count in this example will be 25. The solution is in VB.Net, but you get the idea.
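As a quick aside before the VB.Net listing, the page-count arithmetic alone can be sketched in a few lines of Python. The function name and the choice to round up are assumptions for illustration, not part of the original solution; the "play it safe" variant described above would subtract one instead.

```python
import math


def page_count(total_estimated_matches, page_size=25):
    """Number of API calls needed to cover the estimate, rounding up
    so that a partial last page is still requested."""
    return math.ceil(total_estimated_matches / page_size)


print(page_count(100))  # 4
print(page_count(101))  # 5
```

Since totalEstimatedMatches is, as the name says, an estimate, the last call or two may return fewer items than the page size, or none.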

        'the secret key 
        Dim accessKey As String = "xxxxxxxxxxxxxxxxxxxxxxxxx"
        Dim endpoint As String = "https://api.cognitive.microsoft.com/bing/v7.0/news/search?"

        Dim queryString = HttpUtility.ParseQueryString(String.Empty)
        queryString("q") = search_criteria 'Uri.EscapeDataString(search_criteria)
        queryString("mkt") = market
        queryString("count") = "25"
        queryString("offset") = "0"
        queryString("freshness") = freshness
        queryString("SafeSearch") = "strict"

        ' Construct the URI of the search request
        uriQuery = endpoint & queryString.ToString

        ' Perform the Web request and get the response
        request = HttpWebRequest.Create(uriQuery)
        request.Headers.Add("Ocp-Apim-Subscription-Key", accessKey)

        response = CType(request.GetResponseAsync.Result, HttpWebResponse)
        json = (New StreamReader(response.GetResponseStream)).ReadToEnd

        'create json object
        Dim converter = New ExpandoObjectConverter()
        Dim message As Object = JsonConvert.DeserializeObject(Of ExpandoObject)(json, converter)

        'get top level object and its sub objects
        s = message.value

        Try
            totalEstimatedMatches = CInt(message.totalEstimatedMatches)
            total_available_for_processing = s.Count
        Catch ex As Exception
        End Try

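        'get the total number of pages available at 25 records per page, so we page through 25 records per API call
        'integer division (\) rounds down, so a partial last page is skipped; use Math.Ceiling to include it
        Dim page_count As Integer = totalEstimatedMatches \ 25

        'loop through the pages and request each one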
        For p As Integer = 0 To page_count - 1

            'count stays at 25; the offset advances by one page per iteration (0, 25, 50, ...)
            queryString("count") = "25"
            queryString("offset") = (p * 25).ToString()

            ' Construct the URI of the search request
            uriQuery = endpoint & queryString.ToString

            ' Perform the Web request and get the response
            request = HttpWebRequest.Create(uriQuery)
            request.Headers.Add("Ocp-Apim-Subscription-Key", accessKey)

            response = CType(request.GetResponseAsync.Result, HttpWebResponse)
            json = (New StreamReader(response.GetResponseStream)).ReadToEnd

            'create json object
            message = JsonConvert.DeserializeObject(Of ExpandoObject)(json, converter)

            'get top level object and its sub objects
            s = message.value

            For i As Integer = 0 To s.Count - 1

                Dim myuri As Uri = New Uri(s(i).url.ToString)
                Dim vendor_domain As String = myuri.Host

                System.Diagnostics.Debug.WriteLine(icount & "," & myuri.ToString & "," & vendor_domain)
                icount = icount + 1
            Next
            System.Threading.Thread.Sleep(100)

        Next
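For readers who don't use VB.Net, the same loop can be sketched in Python. This is a rough sketch under stated assumptions: `fetch_page` is a hypothetical callable standing in for the HTTP request and JSON parsing, and `fake_fetch` is a made-up stub so the example runs offline. Only the field names `totalEstimatedMatches`, `value`, and `url` come from the listing above. Unlike the listing, it reuses the first response instead of requesting offset 0 twice, and it stops early if a page comes back empty.

```python
def collect_urls(fetch_page, page_size=25):
    """Page through results with a fixed stride and collect article URLs.

    fetch_page(offset, count) is a hypothetical callable that returns the
    parsed JSON response as a dict.
    """
    first = fetch_page(0, page_size)
    total = first.get("totalEstimatedMatches", 0)
    urls = [item["url"] for item in first.get("value", [])]

    offset = page_size
    while offset < total:
        page = fetch_page(offset, page_size)
        items = page.get("value", [])
        if not items:
            # The estimate was too high; nothing more to read
            break
        urls.extend(item["url"] for item in items)
        offset += page_size
    return urls


def fake_fetch(offset, count):
    """Offline stub that pretends the API holds exactly 60 articles."""
    articles = [{"url": f"https://example.com/{i}"} for i in range(60)]
    return {"totalEstimatedMatches": 60, "value": articles[offset:offset + count]}


urls = collect_urls(fake_fetch)
print(len(urls))  # 60
```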

Upvotes: 1
