BigQuery Dataflow Load to NULL partition

Question

Using the Apache Beam Python SDK, I've set up a Dataflow pipeline that writes to individual partitions of a date-partitioned table in BigQuery. According to this documentation, every date-partitioned table has special NULL and UNPARTITIONED partitions. According to those docs, I can write to the UNPARTITIONED partition by setting my date far in the past or future, but how can I write to the NULL partition?

I'm trying to load data into a partition based on values in the data, and sometimes the field is null. I'd rather write to the NULL partition than make up a date to use for nulls.

For reference, I write to date partitions doing something like this:

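import apache_beam as beam

# The $YYYYMMDD suffix on the table name selects the target date partition.
beam.io.Write(beam.io.BigQuerySink(table_id + '$20180925',
    project=project_id, dataset=dataset_id, schema=schema))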

What do I need to replace $20180925 with to write to NULL?

Pavan Edara · Accepted Answer

The NULL partition is only available in tables that are partitioned by a column in the data, not in ingestion-time partitioned tables. If you are writing to a column-partitioned table, you can simply leave that column unpopulated (null) in a particular row and write to table_id without any partition suffix; that row then lands in the NULL partition.
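
As a rough illustration of the above, here is a minimal sketch. It assumes the table is partitioned on a nullable DATE column, here given the hypothetical name event_date, and it uses beam.io.WriteToBigQuery (the newer transform) rather than the BigQuerySink from the question. The helper names to_row and write_rows are made up for this example, and table_id, dataset_id, project_id and schema are the same placeholders as in the question.

```python
from datetime import date

try:
    import apache_beam as beam  # only needed to build the actual write step
except ImportError:  # lets to_row() be exercised without Beam installed
    beam = None

# Assumption: hypothetical name of the column the table is partitioned on.
PARTITION_FIELD = "event_date"


def to_row(record):
    """Copy a record; a missing or null partition value stays None."""
    row = dict(record)
    value = row.get(PARTITION_FIELD)
    # Dates are rendered as ISO strings; None is passed through untouched.
    row[PARTITION_FIELD] = value.isoformat() if isinstance(value, date) else value
    return row


def write_rows(rows, table_id, dataset_id, project_id, schema):
    """Append rows to the table, addressed without any $YYYYMMDD suffix."""
    return rows | beam.io.WriteToBigQuery(
        table_id,
        dataset=dataset_id,
        project=project_id,
        schema=schema,
        write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
    )
```

In a pipeline this would be applied as something like write_rows(records | beam.Map(to_row), table_id, dataset_id, project_id, schema). Rows whose event_date is None should then be routed to the NULL partition by BigQuery itself, so no partition decorator is needed for any row.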
