BigQueryIO read get TableSchema

Question

I want to read an existing table and generate a new table that has the same schema as the original plus a few extra columns (computed from some columns of the original table). The original table's schema may gain fields without notice (the fields I use in my Dataflow job won't change), so I would like to always read the schema instead of defining a custom class that contains it.
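
To make the goal concrete, here is a minimal sketch of the schema extension step. `Field` and `SchemaExtension` are hypothetical stand-ins I made up so the example runs without any BigQuery dependency; in the real job I would presumably copy the list returned by `TableSchema.getFields()` and append `TableFieldSchema` entries to it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SchemaExtension {
    // Hypothetical stand-in for a BigQuery field: just a name and a type.
    static final class Field {
        final String name;
        final String type;

        Field(String name, String type) {
            this.name = name;
            this.type = type;
        }

        @Override
        public String toString() {
            return name + ":" + type;
        }
    }

    // Returns a new field list: every original field, in order, followed by the extras.
    // Fails if an extra column collides with an original field name, which could
    // happen once the source table gains fields without notice.
    static List<Field> extend(List<Field> original, List<Field> extras) {
        List<Field> result = new ArrayList<>(original);
        for (Field extra : extras) {
            for (Field existing : original) {
                if (existing.name.equals(extra.name)) {
                    throw new IllegalArgumentException(
                        "Column already exists in source table: " + extra.name);
                }
            }
            result.add(extra);
        }
        return result;
    }

    public static void main(String[] args) {
        // Pretend this list was read from the source table.
        List<Field> original = Arrays.asList(
            new Field("id", "INTEGER"),
            new Field("name", "STRING"));
        // Columns computed by the job.
        List<Field> extras = Arrays.asList(new Field("score", "FLOAT"));

        System.out.println(extend(original, extras));
        // prints [id:INTEGER, name:STRING, score:FLOAT]
    }
}
```

The point is that the job never names the original fields it does not use, so new fields in the source table flow through to the output schema unchanged.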

In Dataflow SDK 1.x, I could get the TableSchema via:

final DataflowPipelineOptions options = ...
final String projectId = ...
final String dataset = ...
final String table = ...

final TableSchema schema = new BigQueryServicesImpl()
    .getDatasetService(options)
    .getTable(projectId, dataset, table)
    .getSchema();

For Dataflow SDK 2.x, BigQueryServicesImpl has become a package-private class.

I read the responses in "Get TableSchema from BigQuery result PCollection", but I'd prefer not to make a separate query to BigQuery. As that response is now almost 2 years old, does the SO community have any other thoughts or ideas?
