Reputation: 31
I've recently deleted around 90 million objects (around 100TB of data) from a "Nearline" GCS bucket, and now that I have an almost-empty bucket it takes >5 seconds to list the single remaining file. Standard buckets of ours that have only a dozen files take ~1s to list.
This occurs consistently from both gsutil
as well as Go-based tooling that we've written. This has been tested from multiple VMs ranging in sizes within GCP from the same region as the buckets. All buckets are single-region, the only difference is that the slower one is Nearline, and the others are Standard. Is it really possible that simply listing the files in a bucket takes more than 5 seconds on Nearline?
Since this smells like a garbage collection/vacuum-related slowdown and we've been using it for almost 5 years now I'm inclined to simply delete the bucket and recreate it, but it'd be good to know if anyone has done an accurate characterization of GCP bucket performance with high churn over time.
Upvotes: 0
Views: 234
Reputation: 31
In the 24h since I've posted this, the performance of this bucket has returned to what I'd consider normal. gsutil ls
takes ~1.2s, my custom Go code for listing the bucket takes ~.15s.
By simply waiting and trying again I've answered my own question: yes, the database of bucket keys does seem to have variable (reduced) performance depending on content churn, but it's something that resolves itself relatively quickly.
Upvotes: 0