Reputation: 73
I've just started reading about Cassandra and I can't quite understand how Cassandra decides which nodes it should write the data to.
As I understand it, Cassandra takes part of the primary key, specifically the partition key, and uses the partitioner to hash it into a token, which in turn identifies the node/vnode that the token is bound to. Now let's say I have 2 nodes in my cluster, each with 256 vnodes, and I'm not using any clustering keys, just a simple PK and a bunch of simple columns. Hashing the partition key would clearly determine where the data should go. By this logic, there would be only 512 unique records available for storage. Would be funny if true. So am I wrong about the partitioner part?
Upvotes: 0
Views: 103
Reputation: 4887
Consider the base case: a single node with a single token. Do you think it can store only one record? Of course not.
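The hash determines which node the row will go to, true. But a token does not stand for a single hash value: it marks the end of a range of hash values, and the node (or vnode) that owns the token owns every partition key whose hash falls in that range. Within the node, the primary key determines where the row is stored. So many distinct primary keys may map to the same node/vnode, but they will all be stored separately by the node.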
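To make the range idea concrete, here is a minimal sketch in Python. It is not Cassandra's actual code: SHA-256 truncated to 64 bits stands in for the real partitioner, and the names `token_for`, `build_ring`, `owner_of` and the node names are made up for illustration. Vnode tokens are derived from a hash here just to get deterministic values; a real cluster assigns them differently.

```python
import bisect
import hashlib
from collections import Counter


def token_for(partition_key: str) -> int:
    """Hash a partition key onto a 64-bit token space.

    Assumption: SHA-256 is only a stand-in for the real partitioner.
    """
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_ring(nodes, vnodes_per_node):
    """Return a sorted list of (token, node) pairs, one per vnode."""
    ring = [
        (token_for(f"{node}-vnode-{i}"), node)
        for node in nodes
        for i in range(vnodes_per_node)
    ]
    ring.sort()
    return ring


def owner_of(ring, partition_key):
    """Find the node whose token range contains the key's token.

    Each vnode owns the range (previous_token, its_token], so we look for
    the first ring token >= the key's token and wrap around at the end.
    """
    idx = bisect.bisect_left(ring, (token_for(partition_key),)) % len(ring)
    return ring[idx][1]


# 2 nodes x 256 vnodes = 512 tokens, yet any number of keys can be placed.
ring = build_ring(["node1", "node2"], 256)
counts = Counter(owner_of(ring, f"user-{i}") for i in range(10_000))
print(len(ring), dict(counts))
```

The ring has only 512 tokens, but each one covers a whole slice of the hash space, so all 10,000 keys find an owner. The number of tokens affects how evenly the data is spread, not how many rows can be stored.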
Upvotes: 1