Reputation: 6251

MySQL code causes PHP script to crash at popen/exec

I have the following PHP 5.6.19 code on an Ubuntu 14.04 server. This code simply connects to a MySQL 5.6.28 database, waits a minute, launches another process of itself, then exits.

Note: this is the full script, and its purpose is to demonstrate the problem. It doesn't do anything useful.

class DatabaseConnector {
    const DB_HOST = 'localhost';
    const DB_NAME = 'database1';
    const DB_USERNAME = 'root';
    const DB_PASSWORD = 'password';

    public static $db;

    public static function Init() {
        if (DatabaseConnector::$db === null) {
            DatabaseConnector::$db = new PDO('mysql:host=' . DatabaseConnector::DB_HOST . ';dbname=' . DatabaseConnector::DB_NAME . ';charset=utf8', DatabaseConnector::DB_USERNAME, DatabaseConnector::DB_PASSWORD);
        }
    }
}

$startTime = time();

// ***** Script works fine if this line is removed.
DatabaseConnector::Init();

while (true) {
    // Sleep for 100 ms.
    usleep(100000);

    if (time() - $startTime > 60) {
        $filePath = __FILE__;
        $cmd = "nohup php $filePath > /tmp/1.log 2>&1 &";

        // ***** Script sometimes exits here without opening the process and without errors.
        $p = popen($cmd, 'r');

        pclose($p);

        exit;
    }
}

I start the first process of the script using nohup php myscript.php > /tmp/1.log 2>&1 &.

This process loop should go on forever but... based on multiple tests, within a day (but not instantly), the process on the server "disappears" without reason. I discovered that the MySQL code is causing the popen code to fail (the script exits without any error or output).

What is happening here?

Notes

The server runs 24/7.
Memory is not an issue.
The database connects correctly.
The file path does not contain spaces.
The same problem exists when using shell_exec or exec instead of popen (and pclose).

I also know that popen is the line that fails because I did further debugging (not shown above) by logging to a file at certain points in the script.

Upvotes: 4

Answers (4)

porfirion

Reputation: 1699

I suspect the process doesn't exit after pclose. In that case every process holds its own connection to the database. After some time MySQL's connection limit is reached and new connections fail. To understand what's going on, add some logging before and after the lines DatabaseConnector::Init(); and pclose($p);

Upvotes: 0

jstephenson

Reputation: 2180

Is the parent process definitely exiting after forking? I had thought pclose would wait for the child to exit before returning.

If it isn't exiting, I'd speculate that because the MySQL connection is never closed, you're eventually hitting its connection limit (or some other limit) as you spawn the tree of child processes.

Edit 1

I've just tried to replicate this. I altered your script to fork every half-second, rather than every minute, and was able to kill it off within about 10 minutes.

It looks like the repeated creation of child processes is generating ever more FDs, until eventually no more can be opened:

$ lsof | grep type=STREAM | wc -l
240
$ lsof | grep type=STREAM | wc -l
242
...
$ lsof | grep type=STREAM | wc -l
425
$ lsof | grep type=STREAM | wc -l
428
...

That's because the child inherits the parent's FDs (in this case, the one for the MySQL connection) when it forks.

If you close the MySQL connection before popen with (in your case):

DatabaseConnector::$db = null;

The problem will hopefully go away.
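A minimal sketch of that idea, assuming a hypothetical helper named spawnDetached() that is not part of the original script. It runs a caller-supplied cleanup callback before popen, and it checks popen's return value so a failed launch is logged rather than passing silently:

```php
<?php
// Hypothetical helper (not from the original script): release inherited
// resources first, then launch a background command and report the outcome.
function spawnDetached($cmd, callable $releaseResources)
{
    // Let the caller close connections before the child process is created,
    // so the child does not inherit their file descriptors.
    $releaseResources();

    $p = popen($cmd, 'r');
    if ($p === false) {
        // popen() returns false on failure; log it instead of exiting silently.
        error_log("popen failed for command: $cmd");
        return false;
    }

    pclose($p);
    return true;
}

// Possible usage inside the original loop (names taken from the question):
// spawnDetached("nohup php $filePath > /tmp/1.log 2>&1 &", function () {
//     DatabaseConnector::$db = null;
// });
```

Setting the PDO variable to null only closes the connection if no other reference to the object remains, so the callback should clear every reference the script holds.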

Upvotes: 3

axiac

Reputation: 72286

I had a similar situation using pcntl_fork() and a MySQL connection. The cause here is probably the same.

Background info

popen() creates a child process. The call to pclose() closes the communication channel, and the child process continues to run until it exits. This is when things start to get out of control.

When a child process completes, the parent process receives a SIGCHLD signal. The parent process here is the PHP interpreter that runs the code you posted. The child process is the one launched using popen() (it doesn't matter what command it runs).

There is a small detail here that you probably don't know, or that you found in the documentation and ignored because it doesn't seem to matter much when programming in PHP. It is mentioned in the documentation of sleep():

If the call was interrupted by a signal, sleep() returns a non-zero value.

The PHP sleep() function is just a wrapper of the Linux sleep() call (and the PHP usleep() function is a wrapper of the Linux usleep() call).

What is not told in the PHP documentation is clearly stated in the documentation of the system calls:

sleep() makes the calling thread sleep until seconds seconds have elapsed or a signal arrives which is not ignored.

Back to your code.

There are two places in your code where the PHP interpreter calls the usleep() Linux system function. One of them is clearly visible: your PHP code invokes it. The other one is hidden (see below).

What happens (the visible part)

Starting with the second iteration, if a child process (created using popen() on a previous iteration) happens to exit while the parent program is inside the usleep(100000) call, the PHP interpreter process receives the SIGCHLD signal and its execution resumes before the timeout has elapsed. The usleep() call returns earlier than expected. Because the timeout is short, this effect is not observable by the naked eye. Use 10 seconds instead of 0.1 seconds and you'll notice it.

However, apart from the broken timeout, this doesn't affect the execution of your code in a fatal manner.

Why it crashes (the invisible part)

The second place where an incoming signal hurts your program's execution is hidden deep inside the code of the PHP interpreter. For protocol-related reasons, the MySQL client library uses sleep() and/or usleep() in several places. If the interpreter happens to be inside one of these calls when the SIGCHLD arrives, the MySQL client library code resumes unexpectedly and, in many cases, it ends with the erroneous status "MySQL server has gone away (error 2006)".

It's possible that your code ignores (or swallows) the MySQL error status (because it doesn't expect it to happen in that place). Mine didn't and I spent a couple of days of investigation to find out the facts summarized above.

A solution

The solution to the problem is easy once you know the internal details described above. It is hinted at in the documentation quote above: "a signal arrives which is not ignored".

Signals can be masked (blocked) when their arrival is not desired. The PHP PCNTL extension provides the function pcntl_sigprocmask(). It wraps the sigprocmask() Linux system call, which sets which signals are blocked for the program from now on.

There are two strategies you can implement, depending on what you need.

If your program needs to communicate with the database and be notified when the child processes complete, then you have to wrap all your database calls within a pair of calls to pcntl_sigprocmask() that block and then unblock the SIGCHLD signal.

If you don't care when the child processes complete, then you just call:

pcntl_sigprocmask(SIG_BLOCK, array(SIGCHLD));

before you start creating any child process (before the while()). It makes your process ignore the termination of the child processes and lets it run its database queries without undesired interruption.
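The first strategy (blocking SIGCHLD only around database calls) could be sketched as follows. The wrapper name withSigchldBlocked() is an assumption made for illustration, not something from the original code; it relies only on pcntl_sigprocmask() and needs the PCNTL extension:

```php
<?php
// Hypothetical wrapper: block SIGCHLD while $work runs, then restore the
// signal mask that was active before the call. Requires the PCNTL extension.
function withSigchldBlocked(callable $work)
{
    $oldMask = array();
    // Add SIGCHLD to the blocked set and remember the previous mask.
    pcntl_sigprocmask(SIG_BLOCK, array(SIGCHLD), $oldMask);
    try {
        return $work();
    } finally {
        // Restore exactly the previous mask, even if $work threw an exception.
        pcntl_sigprocmask(SIG_SETMASK, $oldMask);
    }
}

// Possible usage with the connection from the question:
// $rows = withSigchldBlocked(function () {
//     return DatabaseConnector::$db->query('SELECT 1')->fetchAll();
// });
```

Restoring the saved mask, rather than unconditionally unblocking SIGCHLD, keeps the wrapper safe to use in code that already had the signal blocked.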

Warning

The default handling of the SIGCHLD signal is to call wait() in order to let the system clean up after the completed child process. What happens if the signal is not handled (because its delivery is blocked) is explained in the documentation of wait():

A child that terminates, but has not been waited for becomes a "zombie". The kernel maintains a minimal set of information about the zombie process (PID, termination status, resource usage information) in order to allow the parent to later perform a wait to obtain information about the child. As long as a zombie is not removed from the system via a wait, it will consume a slot in the kernel process table, and if this table fills, it will not be possible to create further processes. If a parent process terminates, then its "zombie" children (if any) are adopted by init(1), which automatically performs a wait to remove the zombies.

In plain English: if you block the reception of the SIGCHLD signal, then you have to call pcntl_wait() yourself in order to clean up the zombie child processes.

You can add:

pcntl_wait($status, WNOHANG);

somewhere inside the while loop (just before it ends, for example).
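Because more than one child may have exited between two iterations, a single non-blocking call can fall behind. A small sketch that drains all exited children in one go, assuming a hypothetical helper named reapExitedChildren() and the PCNTL extension:

```php
<?php
// Hypothetical helper: reap every child that has already exited, without
// blocking the caller. Requires the PCNTL extension.
function reapExitedChildren()
{
    $reaped = 0;
    // With WNOHANG, pcntl_wait() returns the PID of an exited child,
    // 0 if children exist but none has exited yet, and -1 if there are
    // no children at all (or on error). Loop only while a child was reaped.
    while (pcntl_wait($status, WNOHANG) > 0) {
        $reaped++;
    }
    return $reaped;
}
```

Calling reapExitedChildren() once per loop iteration would then take the place of the single pcntl_wait() call shown above.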

Upvotes: 4

symcbean

Reputation: 48387

the script exits without any error or output

Not surprising when there's no error checking in the code. However if it really is "crashing", then:

if the cause is trapped by the PHP runtime, then it will be trying to log an error. Have you tried deliberately creating an error scenario to verify that the reporting/logging is working as you expect?
if the error is not trapped by the PHP runtime, then the OS should be dumping a core file. Have you checked the OS config? Looked for the core file? Analyzed it?
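One way to run the first check is a small self-test that raises a harmless warning on purpose and then looks for it in the log. This is only a sketch: the function name errorLoggingWorks() and the marker text are made up for illustration, and the log path is whatever file you want PHP to write to.

```php
<?php
// Hypothetical self-test: raise a harmless warning and check that it
// reaches the given log file.
function errorLoggingWorks($logFile)
{
    // Report everything and send it to the chosen log file.
    error_reporting(E_ALL);
    ini_set('log_errors', '1');
    ini_set('error_log', $logFile);

    // A unique marker makes the entry easy to find among older log lines.
    $marker = 'logging self-test ' . uniqid();
    trigger_error($marker, E_USER_WARNING);

    $contents = @file_get_contents($logFile);
    return $contents !== false && strpos($contents, $marker) !== false;
}
```

If this returns false for the log file the long-running script uses, a silent exit is to be expected, because errors have nowhere to go.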

$cmd = "nohup php $filePath > /tmp/1.log 2>&1 &";

This probably doesn't do what you think it does. When you run a process in the background with most versions of nohup, it still retains a relationship with the parent process; the parent cannot be reaped until the child process exits - and a child is always spawning another child before it does.

This is not a valid way to keep your code running in the background / as a daemon. What the right approach is depends on what you are trying to achieve. Is there a specific reason for attempting to renew the process every 60 seconds?

(You never explicitly close the database connection - this is less of an issue as PHP should do this when exit is invoked).

You might want to read this and this

Upvotes: 1