Reputation: 349
We're seeing some odd behaviour in the MongoDB PHP driver (v1.3) when executing a slow query. The driver seems to keep opening connections while requests are slow, and I don't fully understand why. Maybe you guys have some suggestions here.
Here are some facts first:
Once a day a Memcache record expires and a slow query is run. This leads to PHP opening up to 800 connections to MongoDB (normally we have around 10 open connections according to the logs). Our website is almost fully Memcached, so the database has no other significant load. With 800 connections open, the website first slows to 30-second load times and later starts throwing several types of MongoException (too many connections / socket exceptions).
It's an ugly query with a group by. To be perfectly clear, we understand this query is slow and idiotic, and we're removing it today. It's just not clear why it screws up the entire website. We use Doctrine as an abstraction layer, but according to the logs this is the actual query, run against a 200,000-document collection (3 fields per document: id/product/date):
{"group":true,"keys":{"product":1},"initial":{"count":0},"reduce":"function (obj, prev) { prev.count++; }","options":[],"db":"Orders","collection":"History"}
After the query is done, its result is written to Memcache for 24 hours, so all new requests get it from Memcache rather than MongoDB. But the connection count still sticks at around 800, the problem does not resolve itself, and after a while the website stops responding entirely. It takes about 10 minutes for those 800 connections to build up.
It feels like a typical race condition, but the query just doesn't seem heavy enough to actually cause one on this server with this load.
Okay, so the questions are: why does a single slow query cause this many open connections and take down the whole site, and how can we prevent it from happening again?
The reason I ask is that our website is growing really fast and we're expecting much more traffic and MongoDB load in the future.
Thanks a lot in advance!
Upvotes: 0
Views: 1521
Reputation: 6922
Given that you're invoking the group command instead of performing a basic read query, you may also be fighting against the JavaScript interpreter in MongoDB 2.2. It's not until 2.4 that the JavaScript interpreter was enhanced to support concurrent execution. If each of these group operations requires JS evaluation (at the very least for the reduce function), you're looking at widespread resource starvation.
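If you do want to keep this computation inside MongoDB rather than precomputing it elsewhere, one way to sidestep the JavaScript interpreter entirely is the aggregation framework, which is available in 2.2 and exposed by driver 1.3 as MongoCollection::aggregate(). This is only a minimal sketch of that alternative, not something from your setup; the connection string is an assumption, and the field names come from the query in your question:

```php
<?php
// Sketch: the same per-product count via the aggregation framework
// (MongoDB 2.2+, MongoCollection::aggregate() in driver 1.3), which runs
// natively and never touches the single-threaded JavaScript interpreter.
// The connection string is an assumption.
$mongo   = new MongoClient('mongodb://localhost:27017');
$history = $mongo->Orders->History;

$out = $history->aggregate(array(
    array('$group' => array(
        '_id'   => '$product',          // group key, as in the original query
        'count' => array('$sum' => 1),  // equivalent of prev.count++
    )),
));

// $out['result'] is a list of documents: { _id: <product>, count: <n> }
```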
I don't have any explanation for the "too many connections" exceptions. Even 800 concurrent connections falls well below MongoDB's limit of 20,000 (note: that limit is being removed in 2.6; see SERVER-8943).
One idea to refactor your application and avoid the group race condition would be to use a single document as a lock, so that only one PHP process recomputes the result and refills the cache. Using findAndModify, you could have a single document with some string _id (e.g. "Orders.History group") and an active field. When a PHP process gets a cache miss and needs to recompute the result, it can first attempt a findAndModify that finds the appropriate _id where active is false and, in the same atomic operation, updates active to true. Only after getting this lock document back should it proceed with the group command. Other PHP processes that can't obtain the lock document (because active will not be false) could be instructed to sleep for a bit, return stale data, or abort the web request.
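A minimal sketch of that lock pattern with the legacy 1.3 driver might look like the following. The "Locks" collection name, the Memcache key, and the server addresses are illustrative assumptions, and it presumes the lock document has been seeded once with active set to false:

```php
<?php
// Sketch of the findAndModify lock described above, using the legacy 1.3 driver.
// Assumes a pre-seeded document { _id: 'Orders.History group', active: false }.
$mongo    = new MongoClient('mongodb://localhost:27017');
$orders   = $mongo->Orders;
$memcache = new Memcache();
$memcache->connect('127.0.0.1', 11211);

$cacheKey = 'orders_history_group';
$result   = $memcache->get($cacheKey);

if ($result === false) {
    // Cache miss: try to atomically claim the lock document
    // (flip active from false to true in a single operation).
    $lock = $orders->Locks->findAndModify(
        array('_id' => 'Orders.History group', 'active' => false),
        array('$set' => array('active' => true))
    );

    if (!empty($lock)) {
        // We hold the lock: run the expensive group and refill the cache.
        $grouped = $orders->History->group(
            array('product' => 1),                                   // keys
            array('count' => 0),                                     // initial
            new MongoCode('function (obj, prev) { prev.count++; }')  // reduce
        );
        $result = $grouped['retval'];
        $memcache->set($cacheKey, $result, 0, 86400);                // cache for 24h

        // Release the lock so the next cache expiry can be handled.
        $orders->Locks->update(
            array('_id' => 'Orders.History group'),
            array('$set' => array('active' => false))
        );
    } else {
        // Another process is already recomputing: sleep briefly, serve stale
        // data, or abort, instead of piling more group commands onto MongoDB.
        usleep(250000);
        $result = $memcache->get($cacheKey); // may still be false; handle upstream
    }
}
```

The important part is that the find-and-flip of the active flag happens in one atomic findAndModify, so only one PHP worker ever wins the lock per cache expiry; everyone else backs off instead of stampeding the database.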
Upvotes: 1