Adam
Adam

Reputation: 17389

How to get a clean BigQuery schema when using a BigTable source?

BigTable: a simple table with a string key and one column family, publication_number, also a string.

This table can be mounted as a dataset in BigQuery by following Google's instructions.

When querying this table I'd expect to see a simple two-column output: rowkey and publication_number, both STRING. Instead I see this:

BigQuery Schema

BigQuery adds a .column, a .column.name, .cell, .cell.timestamp, and finally the value is in .column.cell.value.

Is there a way to simplify this to only have:

rowkey              STRING
publication_number  STRING

And nothing else? I can change both how the table is created and how the export is done.

Upvotes: 1

Views: 473

Answers (1)

Elliott Brossard
Elliott Brossard

Reputation: 33765

One option would be to create a logical view over the table that exposes the schema that you want:

CREATE VIEW dataset.publication_view AS
SELECT
  rowkey,
  cell.value AS publication_number
FROM dataset.publication_table,
UNNEST(publication_number.column)

The following does the same as above, but it allows adding additional columns to the view:

CREATE VIEW dataset.publication_view AS
SELECT
    rowkey,
    ARRAY_TO_STRING(ARRAY(SELECT cell.value FROM UNNEST(publication_number.column)), "") AS publication_number
FROM `dataset.publication_table`

Upvotes: 2

Related Questions