dataroot
dataroot

Reputation: 45

Query for listing Datasets and Number of tables in Bigquery

So I'd like make a query that shows all the datasets from a project, and the number of tables in each one. My problem is with the number of tables.

Here is what I'm stuck with :

SELECT
  smt.catalog_name        as `Project`,
  smt.schema_name         as `DataSet`, 
  ( SELECT
      COUNT(*) 
    FROM ***DataSet***.INFORMATION_SCHEMA.TABLES
  )                       as `nbTable`,
  smt.creation_time,
  smt.location
FROM
  INFORMATION_SCHEMA.SCHEMATA smt
ORDER BY DataSet

The view INFORMATION_SCHEMA.SCHEMATA lists all the datasets from the project the query is executed, and the view INFORMATION_SCHEMA.TABLES lists all the tables from a given dataset.

The thing is that the view INFORMATION_SCHEMA.TABLES needs to have the dataset specified like this give the tables informations : dataset.INFORMATION_SCHEMA.TABLES

So what I need is to replace the *** DataSet*** by the one I got from the query itself (smt.schema_name). I am not sure if I can do it with a sub query, but I don't really know how to manage to do it.

I hope I'm clear enough, thanks in advance if you can help.

Upvotes: 1

Views: 6709

Answers (1)

Daniel Zagales
Daniel Zagales

Reputation: 3032

You can do this using some procedural language as follows:

CREATE TEMP TABLE table_counts (dataset_id STRING, table_count INT64);

FOR record IN 
  (
    SELECT 
      catalog_name as project_id,
      schema_name as dataset_id
    FROM `elzagales.INFORMATION_SCHEMA.SCHEMATA`
  )

DO 
  EXECUTE IMMEDIATE 
    CONCAT("INSERT table_counts (dataset_id, table_count)  SELECT table_schema as dataset_id, count(table_name) from ", record.dataset_id,".INFORMATION_SCHEMA.TABLES GROUP BY dataset_id");

END FOR;

SELECT * FROM table_counts;

This will return something like: enter image description here

Upvotes: 3

Related Questions