favq
favq

Reputation: 789

BigQuery wildcard tables - what are the limits?

I'm thinking about organizing a BigQuery dataset with multiple tables that I can query using wildcards.

In this use case, there will potentially be 1000+ tables, each one representing a different customer. I need to be able to query them individually, and I also need to query them all at the same time.

I'm thinking of using multiple tables rather than a single table with clustering, because, when running queries only for specific customers, I would like to be able to scan only the data that I need.

The Quotas and limits page lists the following limits that look applicable:

My questions are:

1) When running a query with wildcard (such as SELECT * FROM `project_id.dataset_id.table_*`), do all the individual tables that match the wildcard get counted against this 1000 tables limit?

2) What exactly is counted against the 12 MB limit? For example, suppose I have the following tables project_id.dataset_id.table_1, project_id.dataset_id.table_2, project_id.dataset_id.table_3.

3) Are there any other limitations that I should be aware of?

Thank you in advance.

Upvotes: 1

Views: 1799

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173210

When running a query with wildcard, do all the individual tables that match the wildcard get counted against this 1000 tables limit?

All tables that match wildcard and _TABLE_SUFFIX filter will count against 1000 limit

What exactly is counted against the 12 MB limit?

Length of fully qualified names of all tables counted against above explained 1000 tables limit will be counted against 12 MB limit

Are there any other limitations that I should be aware of?

Suggestion - combination of partitioning and clustering will serve you much better in use case you describe in your question. Especially that now you can partition by integer range
This will give your superior flexibility in maintaining and processing data to compare with sharded tables approach
Obviously, there are still limitations related to partitioned tables to consider

Upvotes: 2

Pentium10
Pentium10

Reputation: 208042

My advice is against. You will quickly reach 1000 tables and than you cannot query them all. The other is that having each table separately it will be unmaintainable.

You can use clustering to your help. If one of your columns is the tenant_id than you can use that in clustering.

Upvotes: 2

Related Questions