J. Ayo
J. Ayo

Reputation: 560

Using a wildcard query in Big Query to count the number of records by date across a dataset

I am using the following query:

SELECT 
  _TABLE_SUFFIX AS Table, 
  Date, 
  Count(*) AS Records 
FROM `project.dataset.table_*` 
GROUP BY Date, Table

The issue that I am having is that the result of this query is only looking at a subset of the data (specifically the last 60 days).

If I connect to one of the tables directly (through Data Studio for example), I can use COUNT_DISTINCT(Date) and it gives me a result of ~360.

The source tables in the dataset are date partitioned, could this be causing the issue?

Upvotes: 1

Views: 337

Answers (1)

Enrique Zetina
Enrique Zetina

Reputation: 835

I made a test with a public dataset in BigQuery, the data is a google analytics sample which use this kind of partitioned tables and I got approx 100 results, so, I don't think that the partition be the cause of the issue.

Here the query I used, you can try it in your own project since is a public dataset:

SELECT _TABLE_SUFFIX AS Table, 
Date, 
Count(*) AS Records 
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_2017*`
GROUP BY Date, Table

And I get this result:

BigQuery result

My suggestion is to verify the FROM clause maybe where you put the '*' could be the key, also you can try make test with the public dataset mentioned before to run your own query.

Upvotes: 1

Related Questions