Reputation: 3323
I am working on a Bing News API v7 integration. More precisely, I use the https://api.cognitive.microsoft.com/bing/v7.0/news/search API endpoint.
I found some unexpected paging behavior. (I expected every page to have the same size.)
The documentation page "How to page through results" explains how paging is supposed to work.
I follow that approach. I use 30 for page size; because of that, values for offset are 0, 30, 60, and so on.
For example, with these parameters: query "Java 14", market "en-US", sort by date, and offsets 0, 30, 60, 90, 120, 150
(/bing/v7.0/news/search?q=Java 14&count=30&offset=0&mkt=en-US&sortBy=date),
I get six pages of results, and each contains fewer than 30 URLs:
Page: 0 Total: 27 results
Page: 1 Total: 26 results
Page: 2 Total: 26 results
Page: 3 Total: 29 results
Page: 4 Total: 29 results
Page: 5 Total: 7 results
...
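To make the requests above concrete, here is a minimal Python sketch of how the six request paths can be generated with a fixed step. The helper name build_page_urls and the constant BASE_PATH are made up for illustration; only the query parameters come from the request shown above.

```python
from urllib.parse import urlencode

# Path taken from the request above; the host and subscription key are omitted.
BASE_PATH = "/bing/v7.0/news/search"


def build_page_urls(query, count, pages, market="en-US", sort_by="date"):
    """Return one request path per page, stepping offset by count (hypothetical helper)."""
    urls = []
    for page in range(pages):
        params = {
            "q": query,
            "count": count,
            "offset": page * count,  # 0, 30, 60, ... for count=30
            "mkt": market,
            "sortBy": sort_by,
        }
        # urlencode escapes the space in "Java 14" as "+"
        urls.append(BASE_PATH + "?" + urlencode(params))
    return urls


urls = build_page_urls("Java 14", count=30, pages=6)
for url in urls:
    print(url)
```

This only builds the paths; it sends no request and says nothing about how many results each page actually returns.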
This Stack Overflow question, "What's the expected behavior of the Bing Search API v5 when deeply paginating?", is related but covers Bing API v5. There, the paging values don't follow a fixed-size sequence; instead, the next offset is the previous result size + 1.
So, my question is:
Which value should I use for offset for the second page (Page: 1)? Is it 28 or 30? And which value for the third page (Page: 2): 54 or 60?
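To spell out the arithmetic behind the two candidates, here is a small Python sketch that computes both offset sequences from the page sizes listed above. The function names are made up for illustration, and the sketch does not say which sequence the API expects. Note that the observed counts came from requests that used fixed offsets, so the cumulative sequence is only indicative; in a real run each count would come from the preceding request.

```python
from itertools import accumulate


def fixed_offsets(page_size, pages):
    """Offsets when every page advances by the requested page size."""
    return [page * page_size for page in range(pages)]


def cumulative_offsets(result_counts):
    """Offsets when each page starts at (results returned so far) + 1."""
    running_totals = accumulate(result_counts[:-1])
    return [0] + [total + 1 for total in running_totals]


# Result counts per page as reported in the question
observed = [27, 26, 26, 29, 29, 7]

print(fixed_offsets(30, len(observed)))  # [0, 30, 60, 90, 120, 150]
print(cumulative_offsets(observed))      # [0, 28, 54, 80, 109, 138]
```

The second and third entries of each list are exactly the pairs I am asking about: 30 vs 28 and 60 vs 54.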
Upvotes: 0
Views: 972
Reputation: 1216
Make a first call to the API to read totalEstimatedMatches. Divide totalEstimatedMatches by 25 (or whatever page size you use) to get the number of API calls to make. For example, if totalEstimatedMatches = 100, make 4 API calls, which should return 25 URLs each. I play it safe and reduce that by 1, but you could instead wrap the call in a Try/Catch. s.Count in this example will be 25. The solution is in VB.Net, but you get the idea.
'the secret key
Dim accessKey As String = "xxxxxxxxxxxxxxxxxxxxxxxxx"
Dim endpoint As String = "https://api.cognitive.microsoft.com/bing/v7.0/news/search?"
Dim queryString = HttpUtility.ParseQueryString(String.Empty)
queryString("q") = search_criteria 'Uri.EscapeDataString(search_criteria)
queryString("mkt") = market
queryString("count") = "25"
queryString("offset") = "0"
queryString("freshness") = freshness
queryString("SafeSearch") = "strict"
' Construct the URI of the search request
uriQuery = endpoint & queryString.ToString
' Perform the Web request and get the response
request = HttpWebRequest.Create(uriQuery)
request.Headers.Add("Ocp-Apim-Subscription-Key", accessKey)
response = CType(request.GetResponseAsync.Result, HttpWebResponse)
json = (New StreamReader(response.GetResponseStream)).ReadToEnd
'create json object
Dim converter = New ExpandoObjectConverter()
Dim message As Object = JsonConvert.DeserializeObject(Of ExpandoObject)(json, converter)
'get top level object and its sub objects
s = message.value
Try
totalEstimatedMatches = CInt(message.totalEstimatedMatches)
total_available_for_processing = s.Count
Catch ex As Exception
End Try
'get the number of full pages available at 25 records per page
'(\ is integer division; / would produce a Double and round it to the nearest Integer)
Dim page_count As Integer = totalEstimatedMatches \ 25
'loop through the pages and call the API once per page
For p As Integer = 0 To page_count - 1
'count stays at 25; only the offset changes per page (p = 0 gives offset 0)
queryString("count") = "25"
queryString("offset") = (p * 25).ToString()
' Construct the URI of the search request
uriQuery = endpoint & queryString.ToString
' Perform the Web request and get the response
request = HttpWebRequest.Create(uriQuery)
request.Headers.Add("Ocp-Apim-Subscription-Key", accessKey)
response = CType(request.GetResponseAsync.Result, HttpWebResponse)
json = (New StreamReader(response.GetResponseStream)).ReadToEnd
'create json object
message = JsonConvert.DeserializeObject(Of ExpandoObject)(json, converter)
'get top level object and its sub objects
s = message.value
For i As Integer = 0 To s.Count - 1
Dim myuri As Uri = New Uri(s(i).url.ToString)
Dim vendor_domain As String = myuri.Host
System.Diagnostics.Debug.WriteLine(icount & "," & myuri.ToString & "," & vendor_domain)
icount = icount + 1
Next
System.Threading.Thread.Sleep(100)
Next
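For readers not using VB.Net, the same idea can be sketched in Python. This is an illustration under stated assumptions, not the API's documented behavior: page_offsets, collect_urls, and fake_fetch are hypothetical names, and fake_fetch merely stands in for the real HTTP call so that the loop can run without a subscription key.

```python
def page_offsets(total_estimated_matches, page_size=25):
    """Offsets for every full page, mirroring totalEstimatedMatches \\ 25 above."""
    page_count = total_estimated_matches // page_size
    return [page * page_size for page in range(page_count)]


def collect_urls(fetch_page, total_estimated_matches, page_size=25):
    """Call fetch_page(offset, count) once per page and gather the URLs.

    fetch_page is a hypothetical callable standing in for the HTTP request;
    the loop stops early if a page comes back empty.
    """
    urls = []
    for offset in page_offsets(total_estimated_matches, page_size):
        page = fetch_page(offset, page_size)
        if not page:
            break
        urls.extend(page)
    return urls


def fake_fetch(offset, count):
    """Stand-in data source: 60 items, although the estimate below claims 100."""
    data = [f"https://example.com/{i}" for i in range(60)]
    return data[offset:offset + count]


print(page_offsets(100))                   # [0, 25, 50, 75]
print(len(collect_urls(fake_fetch, 100)))  # 60
```

The early exit is there because totalEstimatedMatches is only an estimate, so the last pages may hold fewer results than the page count suggests.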
Upvotes: 1