gkatzioura
gkatzioura

Reputation: 2810

Google Dataflow store to specific Partition using BigQuery Storage Write API

I want to store data to BigQuery by using specific partitions. The partitions are ingestion-time based. I want to use a range of partitions spanning over two years. I use the partition alias destination project-id:data-set.table-id$partition-date. I get failures since it does recognise the destination as an alias but as an actual table. Is it supported?

Upvotes: 1

Views: 416

Answers (2)

gkatzioura
gkatzioura

Reputation: 2810

Direct writes to partitions by ingestion time is not supported using the Write API. Also using the stream api is not supported if a window of 31 days has passed

From the documentation:

When streaming using a partition decorator, you can stream to partitions within the last 31 days in the past and 16 days in the future relative to the current date, based on current UTC time.

The solution that works is to use BigQuery load jobs to insert data. This can handle this scenario.

Because this operation has lot's of IO involved (files getting created on GCS), it can be lengthy, costly and resource intensive depending on the data. A approach can be to create table shards and split the Big Table to small ones so the Storage Read and the Write api can be used. Then load jobs can be used from the sharded tables towards the partitioned table would require less resources, and the problem is already divided.

Upvotes: 0

Israel Herraiz
Israel Herraiz

Reputation: 656

When you ingest data into BigQuery, it will land automatically in the corresponding partition. If you choose a daily ingestion time as partition column, that means that every new day will be a new partition. To be able to "backfill" partitions, you need to choose some other column for the partition (e.g. a column in the table with the ingestion date). When you write data from Dataflow (from anywhere actually), the data will be stored in the partition corresponding to the value of that column for each record.

Upvotes: 1

Related Questions