Alex

Reputation: 91

How does consistent hashing work in PHP memcached module?

I have not been able to answer a few questions about memcached, although I have searched a lot on the internet.

  1. What is the consistent-hash algorithm used by PHP's memcached module?
  2. What are its settings (that is, how many times does it add a server on the ring?)
  3. Is the consistent-hash array (continuum ring) cached somewhere, or is it recalculated at every script execution? If it's the latter, would this cause a performance issue/waste of computational power?

Thanks

Upvotes: 1

Views: 679

Answers (1)

ficuscr

Reputation: 7073

  1. What is the hash algorithm used by PHP's memcached module?

There are different ones depending on your needs. See this page for the options: http://php.net/manual/en/memcached.constants.php. You can use the standard modulo operation on the count of instances in the pool (DISTRIBUTION_MODULA), or a consistent-hashing key distribution algorithm (DISTRIBUTION_CONSISTENT). There are also options for the hash algorithm applied to the item keys themselves if needed (the HASH_* constants).
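
For reference, a minimal sketch of how you would set those options on the extension (the constants are the extension's documented ones; the host names are just placeholders):

    <?php
    $mc = new Memcached();

    // Consistent (ketama-style) key distribution instead of the default
    // modulo distribution; stay compatible with other ketama clients.
    $mc->setOption(Memcached::OPT_DISTRIBUTION, Memcached::DISTRIBUTION_CONSISTENT);
    $mc->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE, true);

    // Optionally choose the hash applied to item keys (HASH_* constants).
    $mc->setOption(Memcached::OPT_HASH, Memcached::HASH_MD5);

    $mc->addServers([
        ['cache1.example.com', 11211],
        ['cache2.example.com', 11211],
        ['cache3.example.com', 11211],
    ]);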

If you are going to be dropping instances in and out of the pool frequently, you should look at the consistent hashing approach. I've never actually used it and probably should, as it would mitigate the impact of server failures with no measurable performance hit... seems like a no-brainer in hindsight. I think the memcached extension should maybe change the default to that. I've read that with consistent hashing you can expect to lose 10-25% of your keys (or less with more servers in the pool), whereas with the default approach it might be closer to a 100% loss.

To be more specific, the consistent algorithm is based on libketama. The readme there does a good job of summarizing why it came about and how it works.
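
To give a feel for the idea (this is a toy sketch, not libketama's actual code): each server is hashed onto a ring at many points, and a key goes to the first server point at or after the key's own hash, wrapping around at the end. The 160 points per server below is only an illustrative figure.

    <?php
    // Toy consistent-hash ring - illustrative only, not libketama.
    function ringPoint(string $label): int {
        // Use 32 bits of an MD5 digest as an unsigned ring position.
        return (int) hexdec(substr(md5($label), 0, 8));
    }

    function buildRing(array $servers, int $pointsPerServer = 160): array {
        $ring = [];
        foreach ($servers as $server) {
            for ($i = 0; $i < $pointsPerServer; $i++) {
                $ring[ringPoint("$server-$i")] = $server;
            }
        }
        ksort($ring);            // order the points around the ring
        return $ring;
    }

    function lookup(array $ring, string $key): string {
        $h = ringPoint($key);
        foreach ($ring as $point => $server) {
            if ($point >= $h) {
                return $server;  // first point clockwise from the key
            }
        }
        return reset($ring);     // wrapped past the last point
    }

    $ring = buildRing(['10.0.0.1:11211', '10.0.0.2:11211', '10.0.0.3:11211']);
    echo lookup($ring, 'user:42'), "\n";

When a server is removed, only the keys whose nearest point belonged to that server move elsewhere; with the modulo approach nearly every key ends up mapping to a different server.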

  2. What are its settings (that is, how many times does it add a server on the ring?)

I'm not sure I understand what you are asking here. How many 'ticks' are added to the continuum with each new server? I think we would need to look more at the ketama lib; my guess is that it tries to strike a balance between performance and fewer key misses.
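
From what I can tell from the libketama source (treat the exact numbers as an assumption), each server gets on the order of 40 hashes, and each 16-byte MD5 digest is cut into four 32-bit points, so roughly 160 points per equally weighted server, with weights scaling that count. A rough sketch of that style of point generation:

    <?php
    // Ketama-style point generation - a rough sketch; exact counts and
    // byte order are assumptions on my part.
    function ketamaPoints(string $server, int $hashesPerServer = 40): array {
        $points = [];
        for ($i = 0; $i < $hashesPerServer; $i++) {
            $digest = md5("$server-$i", true);           // 16 raw bytes
            foreach (str_split($digest, 4) as $chunk) {  // 4 x 32-bit points
                $points[] = unpack('N', $chunk)[1];
            }
        }
        return $points;
    }

    printf("%d points for one server\n", count(ketamaPoints('10.0.0.1:11211')));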

  3. Is the hash array (continuum ring) cached somewhere, or is it recalculated at every script execution? If it's the latter, would this cause a performance issue/waste of computational power?

Not 100% sure, honestly. While it might be a wasted computation, it still seems trivial enough: an MD5 hash and some arithmetic. All I can say is that memcached is very fast. If you need a highly available, persistent distributed store, you might look at something else like CouchDB. Memcached is very good at what it does... an in-memory key/value store for "small" values.
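
One thing worth noting for the per-request concern: the extension supports persistent instances. If you pass a persistent_id to the constructor, the connection and its server list are reused between requests within the same PHP worker (e.g. under php-fpm), so you only configure the pool when its server list is empty. A sketch, with a made-up pool name and hosts:

    <?php
    $mc = new Memcached('my_pool');   // persistent_id - the name is arbitrary

    if (count($mc->getServerList()) === 0) {
        // Only configure the pool the first time this worker sees it.
        $mc->setOption(Memcached::OPT_DISTRIBUTION, Memcached::DISTRIBUTION_CONSISTENT);
        $mc->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE, true);
        $mc->addServers([
            ['cache1.example.com', 11211],
            ['cache2.example.com', 11211],
        ]);
    }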

You might enjoy these articles:

Upvotes: 2
