Alex Aslanoglou
Alex Aslanoglou

Reputation: 291

BigQuery get table schema via query

Is it possible to get BigQuery table schema info via query? (Something like TABLES but for schema). Not interested in cli approach.

What I'm trying to achieve is to perform an "incremental" update between 2 BQ tables (let`s say staging and production) with the same structure based on 1 "key" column, but with an automatic way to create the key and the rest of the hash based on all other columns.

In MySQL/mariadb would be something like. Get the columns from information_schema.columns for the specific table and string agg all columns based with COLUMN_TYPE = "PRI" and keep is as key, and then get all the others with a string agg also. This would auto generate a SQL query in order to do the rest stuff.

So the starting question is if there is something relevant to MySQL's information_schema.COLUMNS

Upvotes: 19

Views: 60261

Answers (3)

Alvaro
Alvaro

Reputation: 963

UPDATE 2021

It is possible now:

SELECT 
 TO_JSON_STRING(
    ARRAY_AGG(STRUCT( 
      IF(is_nullable = 'YES', 'NULLABLE', 'REQUIRED') AS mode,
      column_name AS name,
      data_type AS type)
    ORDER BY ordinal_position), TRUE) AS schema
FROM
  <YOUR_DATASET>.INFORMATION_SCHEMA.COLUMNS
WHERE
  table_name = <YOUR_TABLE>

Upvotes: 21

thedatastrategist
thedatastrategist

Reputation: 445

Check out the Google documentation on Getting table metadata using INFORMATION_SCHEMA. INFORMATION_SCHEMA provides metadata on your BigQuery datasets.

I believe you want the following:

SELECT * 
FROM `project-id.dataset_name.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS`

Upvotes: 16

Gil Adirim
Gil Adirim

Reputation: 1924

I can see a couple of approaches, though they differ from the MYSQL approach -

  1. If you have to use a query, just create a decorator to wrap a query with a limit 1 clause. While you will be charged the same amount as running a full table scan, it will give you the the schema in the response (see documentation of the query job). Not the best approach due the possibly charges it might incur.
  2. If you aren't tied to the query job, why not use the tables get method? It returns a table resource object which contains the schema

Upvotes: 3

Related Questions