Reputation: 11247
I want to understand how Pinterest does its sharding, as described in this video, but I can't fully follow it. I'm interested because I want to apply the same strategy to my app and build sharding myself on top of Amazon RDS.
From my understanding:
If I assume the following mapping table:
Virtual Shard 1 -> 127.0.0.1
Virtual Shard 2 -> 127.0.0.1
....
Looking at how they build their unique ID (Shard ID + Type + Local Auto Increment), what happens if I decide to add another server, 12.0.0.2, because 127.0.0.1 is running out of capacity and I want to add more machines?
How exactly do I map the shards to the new server? According to the lecture, data doesn't move, so how do they avoid hot spots? I really can't work out how it is done. Can someone give me a good step-by-step explanation? Thanks
Upvotes: 3
Views: 1670
Reputation: 125
The Pinterest engineering blog describes it:
"Adding more capacity
In our system, there are three primary ways to add more capacity. The easiest is to upgrade the machines (more space, faster hard drives, more RAM, whatever your bottleneck is).
The next way to add more capacity is to open up new ranges. Initially, we only created 4,096 shards even though our shard ID is 16 bits (64k total shards). New objects could only be created in these first 4k shards. At some point, we decided to create new MySQL servers with shards 4,096 to 8,191 and started filling those.
The final way we add capacity is by moving some shards to new machines. If we want to add more capacity to MySQL001A (which has shards 0 to 511), we create a new master-master pair with the next largest names (say MySQL009A and B) and start replicating from MySQL001A. "
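To make the quoted scheme concrete, here is a minimal sketch of how a shard ID embedded in the object ID can be resolved through a mapping table, and how repointing a range of shards changes the host without changing any IDs. This is not Pinterest's code. The 16-bit shard ID comes from the quote; the 10-bit type and 36-bit local ID widths, the names make_id, split_id and ShardMap, the host name MySQL010A, and the 256 to 511 split are assumptions for illustration.

```python
# Assumed 64-bit layout: 16-bit shard ID (from the quote),
# 10-bit type and 36-bit local ID (assumed widths for this sketch).
SHARD_BITS, TYPE_BITS, LOCAL_BITS = 16, 10, 36


def make_id(shard_id, type_id, local_id):
    """Pack shard ID, type and local auto-increment into one integer."""
    return (shard_id << (TYPE_BITS + LOCAL_BITS)) | (type_id << LOCAL_BITS) | local_id


def split_id(object_id):
    """Recover (shard_id, type_id, local_id) from a packed ID."""
    shard_id = (object_id >> (TYPE_BITS + LOCAL_BITS)) & ((1 << SHARD_BITS) - 1)
    type_id = (object_id >> LOCAL_BITS) & ((1 << TYPE_BITS) - 1)
    local_id = object_id & ((1 << LOCAL_BITS) - 1)
    return shard_id, type_id, local_id


class ShardMap:
    """Hypothetical mapping table: virtual shard ID -> database host."""

    def __init__(self):
        self._host_by_shard = {}

    def assign(self, first, last, host):
        """Point shards first..last (inclusive) at host, replacing any old entry."""
        for shard_id in range(first, last + 1):
            self._host_by_shard[shard_id] = host

    def host_for(self, object_id):
        """Look up the host that currently serves the shard inside object_id."""
        shard_id, _, _ = split_id(object_id)
        return self._host_by_shard[shard_id]


shard_map = ShardMap()
shard_map.assign(0, 511, "MySQL001A")

pin_id = make_id(shard_id=300, type_id=1, local_id=7)
print(shard_map.host_for(pin_id))  # served by MySQL001A

# Moving shards: once the new pair has replicated from MySQL001A,
# repoint part of the range (assumed split). The ID itself never changes.
shard_map.assign(256, 511, "MySQL009A")
print(shard_map.host_for(pin_id))  # now served by MySQL009A

# Opening a new range: new objects get shard IDs from 4096..8191,
# which live on new servers (host name is hypothetical).
shard_map.assign(4096, 8191, "MySQL010A")
```

The point of the sketch is that an object never changes its shard, so its ID stays valid forever; only the shard-to-host entry changes. Hot spots are limited either by directing new writes to freshly opened shard ranges or by splitting a busy machine's shards across a new replica pair.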
Upvotes: 1
Reputation: 180124
Tumblr has an open source library called Jetpants that handles their sharding needs. You can take a look at how they handle all of these things. To my knowledge, Pinterest hasn't released their particular implementation.
As I noted in my comment, though, in most cases the answer to "how should I shard" is "don't shard, there are better options for virtually all sites".
Upvotes: 1