Jacob Schaer

Reputation: 737

BigQuery Results Not Including Page Token

We're working on pandas.io.gbq and are noticing some unusual behaviour for large result sets. The first few pages of data return valid pageTokens, but after that every couple of pages returns none. A trimmed copy of the JSON returned can be viewed at: https://gist.github.com/jacobschaer/8309204

The code works roughly as follows (where bq is from bq.py):

import bq
import bigquery_client
# ...
client = bq.Client.Get()
# Parameters for the first getQueryResults call; no pageToken is passed yet.
kwds = {
    'timeoutMs': 0,
    u'projectId': u'xxxxxxx',
    'startIndex': 0,
    'maxResults': 1000000,
    u'jobId': u'bqjob_r36320b28158a7c96_000001436eb0431c_1',
}
data = client.apiclient.jobs().getQueryResults(**kwds).execute()

This might be related to: BigQuery paging issues with tableData.list()

Ultimately, we are winding up with duplicates in the result set.

Upvotes: 1

Views: 958

Answers (1)

Eric

Reputation: 81

I can help you get to the bottom of this. The snippet of code above kicks off fetching the first set of query results and should produce a page token. It sounds like the issue arises in the subsequent calls. Can you show me how you are making them?

Could you also clarify something for me? You mentioned that the first few pages return valid page tokens and then every couple of pages return none. Do you mean you are getting pages that contain no page token at all, or pages that contain a valid page token but no results?
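
For comparison, here is a minimal sketch of how the subsequent calls are usually driven by the page token. It is not taken from pandas.io.gbq or bq.py: fetch_all_rows and get_page are hypothetical names, and get_page is assumed to wrap something like client.apiclient.jobs().getQueryResults(**params).execute(). The sketch also assumes that a response without a pageToken is the last page, and that paging is driven by pageToken alone rather than by advancing startIndex.

```python
def fetch_all_rows(get_page, **kwds):
    """Collect rows from every page by following pageToken.

    get_page is a hypothetical callable that performs one
    getQueryResults request with the given keyword arguments
    and returns the decoded JSON response as a dict.
    """
    rows = []
    page_token = None
    while True:
        params = dict(kwds)  # copy so the caller's kwds are not mutated
        if page_token is not None:
            # Ask for the page that follows the one we just received.
            params['pageToken'] = page_token
        page = get_page(**params)
        # A page may legitimately carry no 'rows' key.
        rows.extend(page.get('rows', []))
        page_token = page.get('pageToken')
        if not page_token:
            # Assumption: a missing pageToken means there are no more pages.
            break
    return rows
```

If your loop differs from this, for example by incrementing startIndex while also passing a pageToken, that difference would be the first place I'd look for the duplicates.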

Upvotes: 1
