Reputation: 18790
I am fully aware of the documentation on how to name S3 objects within a bucket to optimize performance,
but I cannot understand the example in this article: https://aws.amazon.com/blogs/aws/amazon-s3-performance-tips-tricks-seattle-hiring-event/
2134857/gamedata/start.png
2134857/gamedata/resource.rsrc
2134857/gamedata/results.txt
2134858/gamedata/start.png
2134858/gamedata/resource.rsrc
2134858/gamedata/results.txt
2134859/gamedata/start.png
2134859/gamedata/resource.rsrc
2134859/gamedata/results.txt
The article says "All these reads and writes will basically always go to the same partition",
but we should have three partitions (2134857, 2134858, 2134859), right?
If we reverse the IDs:
7584312/gamedata/start.png
7584312/gamedata/resource.rsrc
7584312/gamedata/results.txt
8584312/gamedata/start.png
8584312/gamedata/resource.rsrc
8584312/gamedata/results.txt
9584312/gamedata/start.png
9584312/gamedata/resource.rsrc
9584312/gamedata/results.txt
we also have three partitions: 7584312, 8584312, 9584312.
What is the difference?
What is the definition of a prefix, and how does it relate to the partitioning strategy?
Upvotes: 4
Views: 3402
Reputation: 5012
S3 partitioning does not (always) occur on the full ID. It will usually be some sort of partial match on the leading characters of the key. Your first example will likely land on a single partition, with a partition match of, say, 2134, 21348, or 213485.
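To make the partial-match idea concrete, here is a minimal sketch that computes only the longest leading string shared by the keys in each naming scheme. It does not reproduce S3's actual partitioning logic, which is internal; the helper `keys_for` and the variable names are made up for illustration.

```python
import os

def keys_for(ids):
    # Hypothetical helper: build one object key per ID, as in the question.
    return [f"{i}/gamedata/start.png" for i in ids]

sequential_ids = ["2134857", "2134858", "2134859"]
reversed_ids = [i[::-1] for i in sequential_ids]  # 7584312, 8584312, 9584312

# os.path.commonprefix compares character by character, not by path component.
seq_prefix = os.path.commonprefix(keys_for(sequential_ids))
rev_prefix = os.path.commonprefix(keys_for(reversed_ids))

print(repr(seq_prefix))  # '213485'
print(repr(rev_prefix))  # ''
```

The sequential keys all start with 213485, so any partition boundary drawn on a shorter leading match keeps them together. The reversed keys differ at the very first character, so they can be spread across the keyspace.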
More info from the blog post you linked to:
As we said, S3 has automation that continually looks for areas of the keyspace that need splitting. Partitions are split either due to sustained high request rates, or because they contain a large number of keys (which would slow down lookups within the partition). ... This split operation happens dozens of times a day all over S3 and simply goes unnoticed from a user performance perspective.
Upvotes: 8