Christopher Lörken

Reputation: 2770

Nginx php-fpm clogs up with writing connections under high load

We have nginx/1.6.2 running with php5-fpm (PHP 5.6) on a Debian 8 system.

Over the past few days we have had higher load than usual because more users are hitting our servers, with most visitors coming in the evening between 6pm and midnight.

For the past couple of days, two different servers running the above setup have shown very slow response times for several hours at a stretch. In Munin, we saw that there were suddenly hundreds of nginx connections in the "writing" state where there had previously been only about 20 at a time.
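For anyone who wants to watch the same counter without Munin, here is a minimal sketch. It assumes nginx's stub_status module is enabled and that its plain-text output has been fetched somehow; the `parse_stub_status` name and the sample numbers are made up for illustration and are not taken from our servers.

```python
import re


def parse_stub_status(text):
    """Parse nginx stub_status plain-text output into integer counters."""
    active = re.search(r"Active connections:\s*(\d+)", text)
    states = re.search(
        r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text
    )
    if not active or not states:
        raise ValueError("input does not look like stub_status output")
    reading, writing, waiting = (int(n) for n in states.groups())
    return {
        "active": int(active.group(1)),
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


# Hypothetical sample resembling the situation described above.
SAMPLE = """Active connections: 412
server accepts handled requests
 1663094 1663094 3107046
Reading: 4 Writing: 380 Waiting: 28
"""

stats = parse_stub_status(SAMPLE)
print(stats["writing"])
```

Polling this every few seconds and logging the "writing" value would show when the count starts to climb.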

We do not get any errors other than timed-out connections on remote hosts when trying to access those servers. All the logs I looked at appeared normal.

The problem can be fixed with a restart of php5-fpm.

My question now is: why do hundreds of connections suddenly claim to be writing? Is there a known issue, or a config setting we missed, that could cause this?

Here is the complete list of symptoms we see:

For the setup: as stated above. We use Zend's built-in opcode cache and APCu for caching some user variables. One of the servers runs a memcache instance (which works fine throughout the problem) and the other runs Redis, which also keeps running fine while the problem occurs.

Can anyone shed some light on what the problem might be?

Thanks!

Upvotes: 0

Views: 900

Answers (2)

user1518820

Reputation: 61

We had the same problem. In our case the cause was that the data in Redis had grown beyond the "maxmemory" limit, so Redis was unable to write any more data. I could log in with redis-cli but couldn't set a value. If you suspect the same issue, log in to Redis using redis-cli and try to set something: if the Redis memory is full, you'll get an error.
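The same check can be done from the numbers Redis reports. This is a minimal sketch, assuming you have captured the text of `redis-cli INFO memory`; the function names and the sample values are hypothetical. Note that whether writes actually fail at the limit depends on `maxmemory_policy` (with `noeviction` they do), and that `maxmemory:0` means no limit is set.

```python
def parse_redis_info(text):
    """Turn 'key:value' lines from Redis INFO output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks and section headers such as "# Memory".
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def memory_is_full(info_text):
    """Return True if used_memory has reached a configured maxmemory limit."""
    fields = parse_redis_info(info_text)
    used = int(fields["used_memory"])
    limit = int(fields.get("maxmemory", "0"))
    # A limit of 0 means Redis has no memory cap configured.
    return limit > 0 and used >= limit


# Hypothetical sample: 1 GiB limit, usage slightly above it.
SAMPLE_INFO = """# Memory
used_memory:1073741900
maxmemory:1073741824
maxmemory_policy:noeviction
"""

print(memory_is_full(SAMPLE_INFO))
```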

Upvotes: 0

Christopher Lörken

Reputation: 2770

We found the problem: APCu seems to be unstable with PHP 5.6.

Details:

  • Debian 8
  • nginx/1.6.2
  • PHP 5.6.14-0+deb8u1
  • APCu 4.0.7 (Revision: 328290, 126M shm_size)

We used xhprof to profile requests while the server was slow (see question) and noticed that APCu took more than 100 ms per read/write operation. Clearing the APCu variables did not help. All other parts of the code ran at normal speed.

We completely disabled our use of APCu and the system has been stable since.

So it seems that this APCu version is unstable under load with PHP 5.6, at least for us.

Upvotes: 0
