Jon Chase

Reputation: 563

Hard limit on number of tables in a BQ project

I've got some highly partitionable data that I'd like to store in BigQuery, where each partition would get its own table. My question is whether BQ will support the number of tables I'll need.

With my data set, I'd be creating approximately 2,000 new tables daily. All tables would have a 390-day (13-month) expiration, so eventually there'd be a steady-state count of roughly 2,000 tables/day × 390 days ≈ 780,000 tables in this particular project.
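
For reference, the per-partition table creation I have in mind looks roughly like this (a minimal sketch using the Python client; the schema, dataset, and table names below are just placeholders):

    from datetime import datetime, timedelta, timezone

    from google.cloud import bigquery

    client = bigquery.Client()

    # Placeholder schema; the real tables would hold the partitioned data.
    schema = [
        bigquery.SchemaField("key", "STRING"),
        bigquery.SchemaField("value", "FLOAT"),
    ]

    # One of the ~2,000 per-partition tables created each day (name is illustrative).
    table = bigquery.Table("my-project.my_dataset.partition_20160101_0001", schema=schema)
    table.expires = datetime.now(timezone.utc) + timedelta(days=390)  # 13-month expiration
    client.create_table(table)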

I'd test this myself, but BQ only supports a max of 10,000 load jobs per project per day.

Does anyone have experience with this sort of table count? Is there any official table limit provided by Google?

Upvotes: 4

Views: 1258

Answers (2)

shollyman

Reputation: 4384

There are projects with that number of distinct tables today. There is not currently a hard cap on the number of distinct tables.

Some related considerations that come to mind when you're contemplating representations that use that many tables:

  • A query (including referenced views) can currently reference only 1,000 tables.

  • Datasets with large numbers of tables may exhibit problematic behavior when using table wildcard functions.

  • You may be oversharding. Rather than lots of individual tables, you may simply want to use a wider schema and fewer tables.

  • If you're heavily dependent on time intervals as a sharding consideration, you may also want to look at table decorators as a way of limiting the scope of data scans (see the first sketch after this list).

  • You may also want to collapse data over time into fewer, larger tables as they age and are less frequently accessed. For example, copy jobs can append multiple source tables into a single destination table (see the second sketch after this list).
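
As a rough sketch of the decorator point above (project, dataset, and table names are placeholders, and this assumes the Python client with legacy SQL enabled), a range decorator limits a query to recently added data:

    from google.cloud import bigquery

    client = bigquery.Client()

    # Table decorators require legacy SQL. The range decorator @-3600000- limits
    # the scan to data added in the last hour (offset in milliseconds from now).
    job_config = bigquery.QueryJobConfig(use_legacy_sql=True)
    sql = "SELECT COUNT(*) FROM [my-project:my_dataset.events@-3600000-]"
    rows = client.query(sql, job_config=job_config).result()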
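
And a hedged sketch of collapsing older daily shards with a copy job (again, the table names are purely illustrative):

    from google.cloud import bigquery

    client = bigquery.Client()

    # Hypothetical shard names: append two daily shards into one monthly rollup.
    sources = [
        "my-project.my_dataset.events_20160101",
        "my-project.my_dataset.events_20160102",
    ]
    destination = "my-project.my_dataset.events_201601"

    job_config = bigquery.CopyJobConfig(
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
        create_disposition=bigquery.CreateDisposition.CREATE_IF_NEEDED,
    )
    client.copy_table(sources, destination, job_config=job_config).result()  # wait for the job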

Upvotes: 6

Felipe Hoffa

Reputation: 59325

Most limits can be raised in BigQuery as long as you are using BigQuery right: the limits are there to prevent abuse and misuse.

A critical question here: how much data will each table hold? Having 780,000 tables with 10 rows each isn't a good idea.

How many tables do you want to handle per query? There's a hard limit of 1,000 tables per query.

If you have an interesting use case that requires higher limits, getting a support contract and asking for their advice is the best way to have the default limits raised.

https://cloud.google.com/support/

Upvotes: 2
