Reputation: 1388
I have an 8 GB table in BigQuery that I'm trying to export to Google Cloud Storage (GCS). If I specify the destination URI as is, I'm getting an error
Errors:
Table gs://***.large_file.json too large to be exported to a single file. Specify a uri including a * to shard export. See 'Exporting data into one or more files' in https://cloud.google.com/bigquery/docs/exporting-data. (error code: invalid)
Okay... I'm specifying * in the file name, but it exports to 2 files: one of 7.13 GB and one of ~150 MB.
UPD: I thought I should get about 8 files of 1 GB each? Am I wrong? Or what am I doing wrong?
P.S. I tried this in the web UI as well as with the Java library.
Upvotes: 3
Views: 8617
Reputation: 15632
To export it to GCS you have to go to the table and click EXPORT > Export to GCS.
This opens the export dialog.
In Select GCS location you define the bucket, the folder and the file.
For instance, say you have a bucket named daria_bucket (use only lowercase letters, numbers, hyphens (-), and underscores (_); dots (.) may be used to form a valid domain name) and want to save the file(s) in the root of the bucket with the name test. Then you write (in Select GCS location)
daria_bucket/test.csv
Because the file is too big, you're getting an error. To fix it, you'll have to break it down into multiple files using a wildcard. So you'll need to add a *, like this:
daria_bucket/test*.csv
This is going to store, inside the bucket daria_bucket, all the data extracted from the table in more than one file, named test000000000000.csv, test000000000001.csv, test000000000002.csv, and so on.
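If you'd rather kick off the same sharded export from the Java library the question mentions, a minimal sketch might look like this (the dataset and table names are placeholders, and I'm assuming the google-cloud-bigquery client):

    import com.google.cloud.bigquery.BigQuery;
    import com.google.cloud.bigquery.BigQueryOptions;
    import com.google.cloud.bigquery.Job;
    import com.google.cloud.bigquery.Table;
    import com.google.cloud.bigquery.TableId;

    public class ShardedExport {
      public static void main(String[] args) throws InterruptedException {
        BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

        // "my_dataset" and "my_table" are placeholders for your own dataset/table.
        Table table = bigquery.getTable(TableId.of("my_dataset", "my_table"));

        // The * in the destination URI lets BigQuery shard the export into
        // multiple files (test000000000000.csv, test000000000001.csv, ...).
        Job job = table.extract("CSV", "gs://daria_bucket/test*.csv");

        // Wait for the extract job to finish and surface any error.
        job = job.waitFor();
        if (job.getStatus().getError() != null) {
          System.out.println("Export failed: " + job.getStatus().getError());
        } else {
          System.out.println("Export completed.");
        }
      }
    }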
In my case (more than 1 year after you asked the question), using a random table of 1.25 GB, I got 16 files of about 80.3 MB each.
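To double-check how many shards you ended up with (and how big they are), you could list the objects under the prefix with the Cloud Storage client. A rough sketch, assuming the google-cloud-storage Java library and the bucket/prefix from the example above:

    import com.google.api.gax.paging.Page;
    import com.google.cloud.storage.Blob;
    import com.google.cloud.storage.Storage;
    import com.google.cloud.storage.StorageOptions;

    public class ListShards {
      public static void main(String[] args) {
        Storage storage = StorageOptions.getDefaultInstance().getService();

        // List every exported shard whose name starts with the "test" prefix.
        Page<Blob> shards =
            storage.list("daria_bucket", Storage.BlobListOption.prefix("test"));

        long count = 0;
        for (Blob shard : shards.iterateAll()) {
          System.out.println(shard.getName() + " - " + shard.getSize() + " bytes");
          count++;
        }
        System.out.println(count + " shard(s) found.");
      }
    }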
Upvotes: 5
Reputation: 59355
For exports above a certain size, BigQuery will write multiple GCS files - that's why it asks for the "*" wildcard.
Once you have multiple files in GCS, you can join them into one with the compose operation:
gsutil compose gs://bucket/obj1 [gs://bucket/obj2 ...] gs://bucket/composite
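If you'd rather do the same merge from Java instead of gsutil, a minimal sketch with the google-cloud-storage client could look like this (the bucket and object names are placeholders, mirroring the gsutil example above):

    import com.google.cloud.storage.Blob;
    import com.google.cloud.storage.BlobInfo;
    import com.google.cloud.storage.Storage;
    import com.google.cloud.storage.StorageOptions;
    import java.util.Arrays;
    import java.util.List;

    public class ComposeShards {
      public static void main(String[] args) {
        Storage storage = StorageOptions.getDefaultInstance().getService();

        // Placeholder shard names - replace with the objects BigQuery exported.
        List<String> sources = Arrays.asList("obj1", "obj2");

        // Compose joins the source objects (up to 32 per call) into one object.
        Storage.ComposeRequest request = Storage.ComposeRequest.newBuilder()
            .addSource(sources)
            .setTarget(BlobInfo.newBuilder("bucket", "composite").build())
            .build();

        Blob composite = storage.compose(request);
        System.out.println("Created " + composite.getName()
            + " (" + composite.getSize() + " bytes)");
      }
    }

Note that a single compose call accepts at most 32 source objects; if the export produced more shards than that, you'd compose them in batches and then compose the intermediate objects.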
Upvotes: 3