Muhammad Gelbana
Muhammad Gelbana

Reputation: 4010

How to count row counts without running a SELECT COUNT(*) query?

I'm trying to get the count of rows in a BigQuery table, using Google App Script. This is how I try to do so

BigQuery.Tables.get(projectId, datasetId, tableId).numRows

No matter what tableId I'm inserting, I'm always getting the value 265. I'm at least sure that a single table has hundreds of millions of rows. So why am I getting 265 as a value for the numRows property ?!

And how can I get the total amount of rows for a BigQuery table without having to run a SELECT COUNT(*) FROM datasetId.tableId query so I don't consume much from my quota ?

Upvotes: 1

Views: 889

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173046

How can I get the total amount of rows for a BigQuery table without having to run a SELECT COUNT(*) FROM datasetId.tableId query so I don't consume much from my quota ?

1 - You might be surprised but SELECT COUNT(*) FROM datasetId.tableId does not incur any cost for you because BigQuery engine is smart enough to just get this stats from so called meta table. You can see the cost message for such type of query if you will open validator (to the right to buttons on BigQuery UI). The message will be - This query will process 0 B when run

2 - Moreover - you can query meta tables by your own as below example shows:

#standardSQL
SELECT table_id,
    DATE(TIMESTAMP_MILLIS(creation_time)) AS creation_date,
    DATE(TIMESTAMP_MILLIS(last_modified_time)) AS last_modified_date,
    row_count,
    size_bytes,
    CASE
        WHEN type = 1 THEN 'table'
        WHEN type = 2 THEN 'view'
        WHEN type = 3 THEN 'external'
        ELSE '?'
    END AS type,
    TIMESTAMP_MILLIS(creation_time) AS creation_time,
    TIMESTAMP_MILLIS(last_modified_time) AS last_modified_time,
    dataset_id,
    project_id
FROM `projectid.datasetid.__TABLES__` 

Upvotes: 4

Related Questions