supermax2015

Reputation: 69

BigQuery - retrieving large number of rows with parallel pagination

I need to retrieve large query results as fast as possible. BQ allows sequential pagination, but it is too slow (200K rows in 10 min).

Is it possible to do parallel pagination, and if so, will the performance actually improve in line with the number of parallel requests?

Upvotes: 1

Views: 1140

Answers (1)

Pentium10

Reputation: 207962

As you probably know, your query results are written to a table: either an anonymous one, or a permanent one if you specify a name for it.

Once you have that table, you can use the tabledata.list API call to read the data from it. Among its optional parameters is startIndex, which you can set to any row offset and use in your pagination script.

You can run parallel API calls with different offsets, which will speed up your retrieval. Make sure you are not saturating your bandwidth if the data is too big.

https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/list
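
As a rough illustration of the offset-based approach above, here is a minimal Python sketch. It splits the table into pages by startIndex and fetches them concurrently with a thread pool. The helper names (tabledata_list_url, page_offsets, fetch_all_parallel) and the fetch_page callback are hypothetical; authentication and the actual HTTP request are left to the caller. It also assumes every call returns the full page it asked for, so a real client should check the returned row count and re-request any shortfall.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode

# Base URL of the BigQuery v2 REST API.
BASE = "https://bigquery.googleapis.com/bigquery/v2"


def tabledata_list_url(project, dataset, table, start_index, max_results):
    """Build the tabledata.list URL for one page (hypothetical helper)."""
    query = urlencode({"startIndex": start_index, "maxResults": max_results})
    return f"{BASE}/projects/{project}/datasets/{dataset}/tables/{table}/data?{query}"


def page_offsets(total_rows, page_size):
    """Return the startIndex value of every page needed to cover the table."""
    return list(range(0, total_rows, page_size))


def fetch_all_parallel(fetch_page, total_rows, page_size, workers=8):
    """Fetch all pages concurrently and return the rows in table order.

    fetch_page(start_index, max_results) is supplied by the caller and is
    assumed to perform one authenticated tabledata.list request and return
    that page's rows as a list.
    """
    offsets = page_offsets(total_rows, page_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the order of the offsets,
        # even though the requests themselves run in parallel.
        pages = pool.map(lambda off: fetch_page(off, page_size), offsets)
        return [row for page in pages for row in page]
```

The worker count is the knob to tune: raise it until throughput stops improving or you start hitting bandwidth or API quota limits.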

Upvotes: 2
