Hello lad

Reputation: 18790

S3 partitioning strategy

I am fully aware of the documentation on how to name an S3 object within a bucket to optimize performance.

However, I cannot understand the example in this article: https://aws.amazon.com/blogs/aws/amazon-s3-performance-tips-tricks-seattle-hiring-event/

2134857/gamedata/start.png
2134857/gamedata/resource.rsrc 
2134857/gamedata/results.txt
2134858/gamedata/start.png
2134858/gamedata/resource.rsrc
2134858/gamedata/results.txt
2134859/gamedata/start.png
2134859/gamedata/resource.rsrc
2134859/gamedata/results.txt

The article says "All these reads and writes will basically always go to the same partition".

But shouldn't we have three partitions (2134857, 2134858, and 2134859)?

If we reverse the ID:

7584312/gamedata/start.png
7584312/gamedata/resource.rsrc
7584312/gamedata/results.txt
8584312/gamedata/start.png
8584312/gamedata/resource.rsrc
8584312/gamedata/results.txt
9584312/gamedata/start.png
9584312/gamedata/resource.rsrc
9584312/gamedata/results.txt

we also have three partitions: 7584312, 8584312, and 9584312.

What is the difference?

What is the definition of a prefix, and how does it relate to the partitioning strategy?

Upvotes: 4

Views: 3402

Answers (1)

Matt Beckman

Reputation: 5012

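S3 partitioning does not (always) occur on the full ID. It will usually be some sort of partial match on the leading characters of the key. It's likely that all the keys in your first example will land on the same partition, using a partition match of, say, 2134, 21348, or 213485. With the reversed IDs, the keys already differ in their first character, so the same kind of partial match can spread them across different partitions.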
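To make the partial-match idea concrete, here is a minimal sketch that groups the keys from the question by a fixed-length leading prefix. It is only an illustration: the helper `group_by_prefix` and the prefix length of 4 are assumptions made for this example, and S3's real partitioning logic is internal and chooses its split points automatically.

```python
from collections import defaultdict


def group_by_prefix(keys, prefix_len):
    """Group keys by their first `prefix_len` characters (hypothetical helper)."""
    groups = defaultdict(list)
    for key in keys:
        groups[key[:prefix_len]].append(key)
    return dict(groups)


ids = ["2134857", "2134858", "2134859"]
files = ["start.png", "resource.rsrc", "results.txt"]

# Keys as in the first example: sequential IDs share their leading digits.
sequential = [f"{i}/gamedata/{f}" for i in ids for f in files]

# Keys as in the second example: each ID is reversed before use.
reversed_keys = [f"{i[::-1]}/gamedata/{f}" for i in ids for f in files]

# Assumed prefix length for illustration only; not a value S3 documents.
PREFIX_LEN = 4

seq_groups = group_by_prefix(sequential, PREFIX_LEN)
rev_groups = group_by_prefix(reversed_keys, PREFIX_LEN)

print(sorted(seq_groups))  # ['2134']
print(sorted(rev_groups))  # ['7584', '8584', '9584']
```

Under this assumed prefix length, all nine sequential keys fall into a single group, while the reversed keys split into three groups of three.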

More info from the blog post you linked to:

As we said, S3 has automation that continually looks for areas of the keyspace that need splitting. Partitions are split either due to sustained high request rates, or because they contain a large number of keys (which would slow down lookups within the partition). ... This split operation happens dozens of times a day all over S3 and simply goes unnoticed from a user performance perspective.

Upvotes: 8
