Graham Polley

Reputation: 14791

Setting table expiration time using Dataflow BigQuery sink

Is there a way to set the expiration time on a BigQuery table when using Dataflow's BigQueryIO.Write sink?

For example, I'd like something like this (see last line):

PCollection<TableRow> mainResults = ...;
mainResults.apply(BigQueryIO.Write
                .named("my-bq-table")
                .to("PROJECT:dataset.table")
                .withSchema(getBigQueryTableSchema())
                .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE)
                .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
                .withExpiration(1454198400000L)); // **this table should expire on 31st Jan (ms since epoch)

I can't see anything in the Dataflow API that would facilitate this. Of course, I could just use the BigQuery API directly, but it would be much better to be able to do this via Dataflow when specifying the sink.

Upvotes: 1

Views: 805

Answers (2)

Michael Sheldon

Reputation: 2057

You can set a defaultTableExpirationMs on a dataset, and then any table created within that dataset will have an Expiration Time of "now + dataset.defaultTableExpirationMs".

See https://cloud.google.com/bigquery/docs/reference/v2/datasets#defaultTableExpirationMs
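For example, here is a minimal sketch that patches a dataset's defaultTableExpirationMs using the BigQuery v2 Java client (this is a plain BigQuery API call, not part of Dataflow); the PROJECT and dataset IDs are the placeholders from your question, and the 30-day value is just an illustration:

import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport;
import com.google.api.client.http.HttpTransport;
import com.google.api.client.json.JsonFactory;
import com.google.api.client.json.jackson2.JacksonFactory;
import com.google.api.services.bigquery.Bigquery;
import com.google.api.services.bigquery.BigqueryScopes;
import com.google.api.services.bigquery.model.Dataset;

public class SetDefaultTableExpiration {
  public static void main(String[] args) throws Exception {
    // Build an authorized BigQuery client using Application Default Credentials.
    HttpTransport transport = GoogleNetHttpTransport.newTrustedTransport();
    JsonFactory jsonFactory = JacksonFactory.getDefaultInstance();
    GoogleCredential credential = GoogleCredential.getApplicationDefault()
        .createScoped(BigqueryScopes.all());
    Bigquery bigquery = new Bigquery.Builder(transport, jsonFactory, credential)
        .setApplicationName("dataset-expiration-example")
        .build();

    // Patch only the defaultTableExpirationMs field; other dataset settings
    // are left untouched. 30 days, expressed in milliseconds.
    Dataset patch = new Dataset()
        .setDefaultTableExpirationMs(30L * 24 * 60 * 60 * 1000);
    bigquery.datasets().patch("PROJECT", "dataset", patch).execute();
  }
}

Note that the default applies only to tables created after the patch; tables that already exist keep their current expiration.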

Upvotes: 0

Sam McVeety

Reputation: 3214

This isn't currently supported in the Dataflow API. We can look at adding it soon, as it should be a straightforward addition.
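In the meantime, one possible workaround (a sketch against the BigQuery v2 Java client, not the Dataflow API) is to patch the table's expirationTime once the pipeline has finished; the client would be constructed as in the other answer, and the table reference below is the placeholder from the question:

import com.google.api.services.bigquery.Bigquery;
import com.google.api.services.bigquery.model.Table;

public class ExpireTable {
  // bigquery is an authorized client, built as in the other answer.
  static void setExpiration(Bigquery bigquery) throws java.io.IOException {
    // expirationTime is absolute, in milliseconds since the epoch;
    // 1454198400000L is 31 Jan 2016 00:00:00 UTC.
    Table patch = new Table().setExpirationTime(1454198400000L);
    bigquery.tables().patch("PROJECT", "dataset", "table", patch).execute();
  }
}

Running this right after the Dataflow job completes gives the table the expiration the question asks for, at the cost of a second API call outside the pipeline.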

Upvotes: 2
