Reputation: 369
As per the GCP documentation:
Long-term storage:
If a table is not edited for 90 consecutive days, the price of storage for that table automatically drops by approximately 50 percent. There is no degradation of performance, durability, availability, or any other functionality when a table is considered long term storage.
Is there any way to list all the tables which fall under this criterion in a BigQuery dataset?
Upvotes: 0
Views: 724
Reputation: 14791
One way is to use some Bash magic. I've knocked this up quickly using Cloud Shell, which has jq pre-installed. It works by:
- bq show grabs the metadata for each table
- jq pulls out the id and numLongTermBytes from each table
- numLongTermBytes is how much of the table's storage is in long term. If it's 0, then it's all active storage.
Not sure how this would work with partitioned & clustered tables (you'll need to test), but it should be enough to get you started at least. You could of course write a neater Python script or something, but I like the 'simplicity' of Bash :)
Note that this is a somewhat fragile/brittle solution.
bq ls --format=json |                              # list the datasets in the project
  jq -r '.[].id' |                                 # pull out each dataset id
  xargs -n 1 -P 4 -i bq ls --format=json {} |      # list the tables in each dataset (4 in parallel)
  jq -r '.[].id' |                                 # pull out each table id
  xargs -n 1 -P 4 -i bq show --format=json {} |    # grab each table's metadata
  jq -r '.id + "," + .numLongTermBytes'            # print "table_id,numLongTermBytes"
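Since the question is scoped to a single dataset, here's a variation that only scans one dataset and keeps just the tables that actually have some long-term storage. It's a minimal sketch under the same bq/jq assumptions as above; my_dataset is a placeholder for your own dataset name:

DATASET="my_dataset"                               # placeholder: your dataset (or project:dataset)

bq ls --format=json "$DATASET" |                   # list the tables in just this dataset
  jq -r '.[].id' |                                 # pull out each table id
  xargs -n 1 -P 4 -i bq show --format=json {} |    # grab each table's metadata
  jq -r 'select((.numLongTermBytes // "0" | tonumber) > 0) | .id + "," + .numLongTermBytes'   # keep only tables with long-term bytes

Each output line is a table id followed by its long-term bytes, so anything that shows up is already being billed (at least partly) at the long-term rate. The // "0" guard is only there in case an object (e.g. a view) doesn't report numLongTermBytes.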
Upvotes: 7