Reputation: 1537
I was at an interview for a middle/senior PHP developer position a couple of days ago (I failed). I was asked a tricky question that is still bothering me.
Imagine we use PHP + Memcached on an extremely loaded project (about 100 hits per second or more). We have a huge SQL query to perform, so we decide to cache its result in Memcached. Now the cache has expired, and we need to run that huge query again to re-cache it. The problem is that all 100 users hit the site simultaneously, so in theory the server will run the SQL query 100 times at once in order to cache it afterwards, and I think that will bring the server down.
How could we solve this? I think the query should run only once, and the other 99 requests should wait until the data exists in Memcached again.
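One common way to get exactly that behavior is to use Memcached's atomic add() as a lock: the first process to win the lock runs the query, and everyone else polls until the cached value appears. Below is a minimal sketch of the idea in Python rather than PHP; FakeMemcached, the key names, and the TTL values are assumptions for illustration, and a real deployment would use an actual Memcached client exposing the same get/add/set/delete operations.

```python
import time

# Minimal in-memory stand-in for a Memcached client (illustration only;
# a real client exposes the same get/add/set/delete calls).
class FakeMemcached:
    def __init__(self):
        self._store = {}

    def get(self, key):
        value, expires = self._store.get(key, (None, 0))
        if value is not None and time.time() < expires:
            return value
        return None

    def add(self, key, value, ttl):
        # Atomic in real Memcached: succeeds only if the key is absent.
        if self.get(key) is None:
            self._store[key] = (value, time.time() + ttl)
            return True
        return False

    def set(self, key, value, ttl):
        self._store[key] = (value, time.time() + ttl)

    def delete(self, key):
        self._store.pop(key, None)


def get_report(cache, run_huge_query, retries=50):
    """Only one caller runs the query; the rest poll until the cache is warm."""
    for _ in range(retries):
        data = cache.get("report")
        if data is not None:
            return data
        # add() is atomic, so exactly one process wins the lock.
        if cache.add("report:lock", "1", ttl=60):
            try:
                data = run_huge_query()
                cache.set("report", data, ttl=3600)
                return data
            finally:
                cache.delete("report:lock")
        time.sleep(0.05)  # someone else holds the lock; wait and re-check
    raise TimeoutError("cache never became warm")
```

With this pattern, 100 simultaneous requests on a cold cache result in one database query; the other 99 spin briefly on the sleep and then read the freshly cached value.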
Upvotes: 1
Views: 826
Reputation: 884
My choice would be to add an additional cache status ("stale", for example) to mark cache entries that have expired and are in the process of being refreshed.
So if a PHP process requests the cached data from Memcached and finds that its state is "expired", it sets the state to "stale", fetches a fresh version from the database, and stores it as "valid" in Memcached.
If another PHP process accesses the cached data while its status is "stale", it will simply use the stale version and will not fetch a fresh version from the database.
Now if you want to squeeze out the last bit of performance, the first PHP process wouldn't fetch the fresh data itself; instead it would delegate that task to another worker (via RabbitMQ, for example) and also return the stale data.
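A minimal sketch of this stale-flag idea, in Python for clarity (FakeMemcached, the SOFT_TTL/HARD_TTL values, and the entry layout are assumptions for illustration; note that real Memcached would need gets/cas to flip the status atomically, which this sketch glosses over):

```python
import time

SOFT_TTL = 300   # logical freshness window (assumed value)
HARD_TTL = 3600  # how long Memcached actually keeps the entry (assumed)

# In-memory stand-in for Memcached; a real client offers the same calls.
class FakeMemcached:
    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value, ttl=0):
        self._store[key] = value


def get_cached(cache, key, fetch_fresh):
    entry = cache.get(key)
    if entry is None:
        # Cold cache: nothing to serve, so we must query the database.
        value = fetch_fresh()
        cache.set(key, {"status": "valid", "value": value,
                        "expires": time.time() + SOFT_TTL}, ttl=HARD_TTL)
        return value
    if entry["status"] == "valid" and time.time() >= entry["expires"]:
        # First process to notice the expiry marks the entry "stale"
        # and takes on the refresh itself.
        entry["status"] = "stale"
        cache.set(key, entry, ttl=HARD_TTL)
        value = fetch_fresh()
        cache.set(key, {"status": "valid", "value": value,
                        "expires": time.time() + SOFT_TTL}, ttl=HARD_TTL)
        return value
    # Either "valid" and fresh, or "stale" (someone else is refreshing):
    # serve what we have without touching the database.
    return entry["value"]
```

The key point is that the entry's physical TTL in Memcached (HARD_TTL) is longer than its logical freshness window (SOFT_TTL), so expired-but-present data is still available to serve while exactly one process refreshes it.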
Upvotes: 2
Reputation: 13417
We have huge SQL query to perform,
How often (let's assume hourly)? How long does it take (let's assume 30 minutes)?
First, nothing (client, server, PHP, user) should be firing off the event that makes the query run. You don't want anything involved in serving page requests to block.
Instead, you execute the query in the background: in a thread, on another machine, or via a cron job (for example, run the query every hour to provide fresh results). When the query completes, you write the result into Memcached, all the while the system keeps running.
This way no page request causes the query to fire (and thus block); additionally, you can steadily and consistently handle the 100 requests per second you were handling previously.
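The split described above can be sketched as two functions that never call each other: a refresher that cron (or a worker machine) runs out of band, and a request handler that only ever reads the cache. This is a Python sketch with a hypothetical FakeMemcached stand-in and a placeholder run_huge_query; the key name and return values are made up for illustration.

```python
# In-memory stand-in for Memcached (a real client exposes the same calls).
class FakeMemcached:
    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value


def run_huge_query():
    # Placeholder for the expensive SQL query.
    return ["row1", "row2"]


def refresh_report(cache):
    """Runs out of band (cron job, worker box), e.g. once per hour.
    Page requests never call this."""
    cache.set("report", run_huge_query())


def handle_request(cache):
    """The front-end path only reads the cache; it never blocks on SQL."""
    data = cache.get("report")
    if data is None:
        return "report not ready yet"  # e.g. show a placeholder page
    return data
```

The refresher would be wired up with something like a crontab entry (hypothetical path): 0 * * * * /usr/bin/python /opt/app/refresh_report.py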
Additionally, MySQL wouldn't execute 100 separate copies of the query. It would run a few in parallel (blocking the rest), but the other ~90 queries would all hit the SQL query cache, so the query wouldn't literally run 100 times.
I doubt you want to work at this place anyways.
Hopefully that makes sense!
Upvotes: 2