user2254391

Reputation: 350

Is there a way to create a BigQuery table with a data-dependent schema in Google Dataflow?

I am trying to create a BigQuery table as part of a Dataflow pipeline. The examples show passing the schema as a TableFieldSchema instance. However, my table schema is data-dependent, so at best it can be produced as an element of a PCollection<TableFieldSchema>. For example:

PCollection<TableRow> quotes = ...;

quotes.apply(BigQueryIO.Write
    .named("Write")
    .to("my-project:output.output_table")
    .withSchema(schema)
    .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE)
    .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED));

Here, schema needs to be a TableFieldSchema, but I only have it as a PCollection<TableFieldSchema>.

Upvotes: 2

Views: 1607

Answers (1)

Davor Bonaci

Reputation: 1729

Unfortunately, we don't have a built-in API for writing to a BigQuery table with a dynamic schema. That said, we are working on improving flexibility in this area. We have no estimates at this time, but we hope to get to this soon.

In the meantime, some workarounds have been proposed in other Stack Overflow questions.
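
One general idea, sketched here under assumptions and not taken from any specific thread: because withSchema needs a concrete schema when the pipeline is constructed, the schema can sometimes be computed in a separate pre-pass over a sample of the data before the writing pipeline is built. The helper below is hypothetical (SchemaInference, typeOf, merge and inferSchema are illustrative names, not part of the Dataflow SDK). It uses plain Java maps in place of TableRow and only handles flat rows, widening conflicting types to FLOAT or STRING.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Dataflow SDK: infers a flat
// field-name -> BigQuery-type mapping from sample rows in plain Java.
final class SchemaInference {

    // Maps a Java value to a BigQuery type name.
    static String typeOf(Object value) {
        if (value instanceof Boolean) return "BOOLEAN";
        if (value instanceof Integer || value instanceof Long) return "INTEGER";
        if (value instanceof Float || value instanceof Double) return "FLOAT";
        return "STRING"; // fallback for strings and anything unrecognized
    }

    // Reconciles two types observed for the same field.
    static String merge(String a, String b) {
        if (a.equals(b)) return a;
        boolean bothNumeric = (a.equals("INTEGER") || a.equals("FLOAT"))
                && (b.equals("INTEGER") || b.equals("FLOAT"));
        return bothNumeric ? "FLOAT" : "STRING";
    }

    // Scans all rows and returns the inferred type per field,
    // in order of first appearance.
    static Map<String, String> inferSchema(List<Map<String, Object>> rows) {
        Map<String, String> schema = new LinkedHashMap<>();
        for (Map<String, Object> row : rows) {
            for (Map.Entry<String, Object> e : row.entrySet()) {
                if (e.getValue() == null) continue; // nulls carry no type information
                schema.merge(e.getKey(), typeOf(e.getValue()), SchemaInference::merge);
            }
        }
        return schema;
    }
}
```

Each entry of the resulting map could then be turned into a TableFieldSchema (name and type) and collected into the schema object passed to withSchema. That translation is left out here because it depends on the SDK version in use, and the pre-pass only makes sense when the data can be read before the main pipeline is constructed.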

Upvotes: 3
