Reputation: 13
I will fetch approximately 500,000 to 1,000,000 rows from BigQuery. We will limit each request with an offset and a max, where pageSize = max and startIndex = offset.
Our data will only be processed once a day and then uploaded to BigQuery.
The documentation recommends using pageToken instead of startIndex.
I did some timing estimates with both pageToken and startIndex and could not see any difference.
I found one answer here on Stack Overflow:
"You should use the page token returned from the original query response or the previous jobs.getQueryResults() call to iterate through pages. This is generally more efficient and reliable than using index-based pagination"
But I'm not convinced why I should use pageToken: I would then need to store the token in order to go back and forth between pages. Timewise, I could not see any difference.
Upvotes: 1
Views: 2103
Reputation: 3642
But I'm not convinced why I should use "pageToken"
There are a few important differences between the two:
Index-based pagination - works well when you know how many records your query returns. It does not take the size of a record into account (which is important for client-side applications).
Page token - points to a specific page in the result set and needs no prior information, such as the size of the results, to access it.
So if you know how many results you have and you don't care about the page size, you can use index-based pagination; otherwise, use the page token.
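To make the comparison concrete, here is a minimal Python sketch of both styles. It does not call BigQuery: fetch_page is a hypothetical in-memory stand-in for a paged results call such as jobs.getQueryResults(), and its parameter names (page_size, start_index, page_token) are assumptions chosen to mirror the question. In the real API the token is an opaque string; here it simply encodes an offset so the example can run on its own.

```python
from typing import List, Optional, Tuple

ROWS = [f"row-{i}" for i in range(10)]  # stand-in for a query result


def fetch_page(
    page_size: int,
    start_index: Optional[int] = None,
    page_token: Optional[str] = None,
) -> Tuple[List[str], Optional[str]]:
    """Hypothetical paged call: returns (rows, next_page_token)."""
    # Assumption: the token just encodes an offset. Real tokens are opaque.
    if page_token is not None:
        offset = int(page_token)
    else:
        offset = start_index or 0
    rows = ROWS[offset:offset + page_size]
    next_offset = offset + len(rows)
    next_token = str(next_offset) if next_offset < len(ROWS) else None
    return rows, next_token


def read_by_index(page_size: int, total_rows: int) -> List[str]:
    """Index-based: the caller must already know total_rows."""
    out: List[str] = []
    for offset in range(0, total_rows, page_size):
        rows, _ = fetch_page(page_size, start_index=offset)
        out.extend(rows)
    return out


def read_by_token(page_size: int) -> Tuple[List[str], List[Optional[str]]]:
    """Token-based: no row count needed; stops when no token is returned."""
    out: List[str] = []
    tokens: List[Optional[str]] = [None]  # tokens[i] opens page i
    while True:
        rows, next_token = fetch_page(page_size, page_token=tokens[-1])
        out.extend(rows)
        if next_token is None:
            break
        tokens.append(next_token)
    return out, tokens


if __name__ == "__main__":
    all_rows, tokens = read_by_token(page_size=3)
    print(len(all_rows), "rows read over", len(tokens), "pages")
    # Going back to the second page only needs the token stored for it.
    print(fetch_page(3, page_token=tokens[1])[0])
```

Keeping the tokens in a list, as read_by_token does, is one way to handle the "going back and forth" concern from the question: the token stored for a page reopens that page without recomputing an offset.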
Upvotes: 2