Imad

Reputation: 2741

How to set an upper bound to BigQuery's extracted file parts?

Say I have a BigQuery table that contains 3M rows, and I want to export it to GCS. What I do is the standard bq extract <flags> ... <project_id>:<dataset_id>.<table_id> gs://<bucket>/file_name_*.<extension>

I am bound by a limit on the number of rows a file (part) can have. Is there a way to set a hard limit on the size of a file part?

For example, can I require each file part to be no larger than 10 MB, or better yet, set the maximum number of rows allowed in a file part? The documentation doesn't seem to mention any flags for this purpose.

Upvotes: 0

Views: 60

Answers (1)

guillaume blaquiere

Reputation: 75715

You can't do it with the BigQuery extract API.

But you can script it (export a few thousand rows at a time in a loop), though you will have to pay for the data those queries process, whereas the extract itself is free. You can also set up a Dataflow job for this (but that isn't free either).
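For illustration, here is a minimal shell sketch of the scripted approach. It assumes a hypothetical table my_project:my_dataset.my_table with an orderable id column, and the placeholder bucket my_bucket; adjust all of these to your setup. Each loop iteration materializes a fixed-size chunk into a temporary table (this query is what you pay for) and then extracts that table to GCS for free:

    #!/usr/bin/env bash
    # Sketch only: chunked export with a hard cap on rows per output file.
    # Assumes my_project:my_dataset.my_table has an "id" column to order by;
    # a keyed WHERE range on id would be cheaper than LIMIT/OFFSET, which
    # rescans the table on every iteration.
    PROJECT=my_project
    DATASET=my_dataset
    TABLE=my_table
    BUCKET=my_bucket
    ROWS_PER_FILE=100000   # maximum rows allowed in one exported file
    TOTAL_ROWS=3000000

    part=0
    for (( offset=0; offset<TOTAL_ROWS; offset+=ROWS_PER_FILE )); do
      # Billed: the bytes processed by this query.
      bq query --use_legacy_sql=false --replace \
        --destination_table="${DATASET}.chunk_${part}" \
        "SELECT * FROM \`${PROJECT}.${DATASET}.${TABLE}\` ORDER BY id LIMIT ${ROWS_PER_FILE} OFFSET ${offset}"

      # Free: extracting the chunk table to GCS.
      bq extract "${PROJECT}:${DATASET}.chunk_${part}" \
        "gs://${BUCKET}/part_$(printf '%05d' ${part}).csv"

      # Clean up the temporary chunk table.
      bq rm -f -t "${PROJECT}:${DATASET}.chunk_${part}"
      part=$((part+1))
    done

Each output file then contains at most ROWS_PER_FILE rows, which is the closest you can get to a per-file row cap without Dataflow.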

Upvotes: 2
