Reputation: 1181
Using the Apache Beam Python SDK, I've set up a Dataflow pipeline that writes to the individual partitions of a date-partitioned table in BigQuery. According to this documentation, every date-partitioned table has special NULL and UNPARTITIONED partitions. According to those docs, I can write to the UNPARTITIONED partition by just setting my date far in the past or future, but how can I write to the NULL partition?
I'm trying to load data into a partition based on values in the data, and sometimes the field is null. I'd rather write to the NULL partition than make up a date to use for nulls.
For reference, I write to date partitions doing something like this:
beam.io.Write(beam.io.BigQuerySink(table_id+'$20180925',
project=project_id, dataset=dataset_id, schema=schema))
What do I need to replace $20180925 with to write to NULL?
Upvotes: 0
Views: 2194
Reputation: 2315
The NULL partition exists only in tables that are partitioned by a column in the data, not in ingestion-time partitioned tables. If you are writing to a column-partitioned table, simply leave that column unpopulated (null) in the row and write to table_id without any partition suffix; the row then lands in the NULL partition.
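As a rough illustration of that rule, here is a minimal sketch of how the destination table name could be chosen per row. It assumes a column-partitioned table; the function name `destination_table` and the table name `events` are hypothetical and not part of Beam or BigQuery.

```python
from datetime import date
from typing import Optional


def destination_table(table_id: str, partition_value: Optional[date]) -> str:
    """Return the table spec to write a row to (hypothetical helper).

    Assumes the table is partitioned by a column in the data. A row whose
    partitioning column is null is written to the bare table name, with no
    partition suffix; otherwise a $YYYYMMDD suffix is appended, as in the
    question's snippet.
    """
    if partition_value is None:
        # No suffix: per the answer, the row goes to the NULL partition.
        return table_id
    # Suffix in the same form as '$20180925' from the question.
    return f"{table_id}${partition_value:%Y%m%d}"


dated = destination_table("events", date(2018, 9, 25))
undated = destination_table("events", None)
```

The returned string would take the place of table_id+'$20180925' in the question's snippet. On an ingestion-time partitioned table this does not apply, since there is no NULL partition to write to.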
Upvotes: 4