Graham
Graham

Reputation: 71

How can I extract a single partition from a partitioned BigQuery table?

According to the BigQuery docs, I should be able to export a single partition of a partitioned table:

Exporting all data from a partitioned table is the same process as exporting data from a non-partitioned table. For more information, see Exporting table data. To export data from an individual partition, append the partition decorator, $date, to the table name. For example: mytable$20160201.

However running the following extract command extracts the entire table, not just one partition. It is driving me nuts! What am I doing wrong?

bq --location=europe-west2 extract \
--destination_format NEWLINE_DELIMITED_JSON \
--compression GZIP \
bq-project-name:dataset.table_name$20200405 \
"gs://bucket-name/test_ga_sessions*.json.gz"

Adding partitioning information of source table here

Source Table Partitioning Info

I have also confirmed that the partition I am attempting to extract exists

#legacySQL
SELECT
  partition_id,
  creation_time,
  creation_timestamp,
  last_modified_time,
  last_modified_timestamp
FROM
  [dataset.tablename$__PARTITIONS_SUMMARY__]
  where partition_id = '20200405'

enter image description here

Upvotes: 2

Views: 1865

Answers (1)

Graham
Graham

Reputation: 71

Because I was running the bq extract command in a bash shell, the partition decorator $20200405 was being interpreted as a variable and an empty one at that. Therefore the full partition identifier of bq-project-name:dataset.table_name$20200405 was being interpreted as bq-project-name:dataset.table_name by the time the request reached BigQuery.

In order to get this command to run correctly, all I had to do was escape the $ character of the partition decorator with a backslash as follows:

bq --location=europe-west2 extract \
--destination_format NEWLINE_DELIMITED_JSON \
--compression GZIP \
bq-project-name:dataset.table_name\$20200405 \
"gs://bucket-name/test_ga_sessions*.json.gz"

Upvotes: 2

Related Questions