PaulH

Reputation: 3049

PHP flock() for read-modify-write does not work

I have a log file maintained by a PHP script. The PHP script is subject to parallel processing. I cannot get the flock() mechanism to work on the log file: in my case, flock() does not prevent the log file shared by PHP scripts running in parallel from being accessed at the same time and being sometimes overwritten.

I want to be able to read a file, do some processing, modify the data and write it back, without the same code running in parallel on the server doing the same thing at the same time. The read-modify-write steps have to run as one uninterrupted sequence.

On one of my shared hosting accounts (OVH France), it does not work as expected. In that case, the counter $c has the same value in different iframes, which should not be possible if the lock works as expected, as it does on another shared hosting account.

Any suggestions to make this work, or for an alternative method?

Googling "read modify write" php or fetch and add or test and set did not provide useful information: all solutions are based on a working flock().

Here is some standalone running demo code to illustrate. It generates a number of parallel requests from the browser to the server and displays the results. A malfunction is easy to spot visually: if your web server does not support flock(), like one of mine, the counter value and the number of log lines will be the same in some frames.

<!DOCTYPE html>
<html lang="en">
<title>File lock test</title>
<style>
iframe {
    width: 10em;
    height: 300px;
}
</style>
<?php
$timeStart = microtime(true);
if ($_GET) { // iframe
    // GET
    $time = $_GET['time'] ?? 'no time';
    $instance = $_GET['instance'] ?? 'no instance';

    // open file
    // $mode = 'w+'; // no read
    // $mode = 'r+'; // does not create file, we have to lock file creation also
    $mode = 'c+'; // read, write, create
    $fhandle = fopen(__FILE__ .'.rwtestfile.txt', $mode) or exit('fopen');
    // lock
    flock($fhandle, LOCK_EX) or exit('flock');
    // start of file (optional, only some modes require it)
    rewind($fhandle);
    // read file (or default initial value if new file);
    // note: "$x = fread(...) or ' 0'" never assigns the default, because "=" binds tighter than "or"
    $fcontent = fread($fhandle, 10000) ?: ' 0';
    // counter value from previous write is last integer value of file
    $c = (int) strrchr($fcontent, ' ') + 1;
    // new line for file
    $fcontent .= "<br />\n$time $instance $c";
    // reset once in a while
    if ($c > 20) {
        $fcontent = ' 0'; // avoid long content
    }
    // simulate other activity
    usleep(rand(1000, 2000));
    // start of file
    rewind($fhandle);
    // write
    fwrite($fhandle, $fcontent) or exit('fwrite');
    // truncate (in unexpected case file is shorter now)
    ftruncate($fhandle, ftell($fhandle)) or exit('ftruncate');
    // close
    fclose($fhandle) or exit('fclose');
    // echo
    echo "instance:$instance c:$c<br />";
    echo $timeStart ."<br />";
    echo microtime(true) - $timeStart ."<br />";
    echo $fcontent ."<br />";
} else {
    echo 'File lock test<br />';
    // iframes that will be requested in parallel, to check flock
    for ($i = 0; $i < 14; $i++) {
        echo '<iframe src="?instance='. $i .'&time='. date('H:i:s') .'"></iframe>'."\n";
    }
}

There is a warning about flock() limitations on the flock() page of the PHP manual, but it is about ISAPI (Windows) and FAT (Windows). My server configuration is:
PHP Version 7.2.5
System: Linux cluster026.gra.hosting.ovh.net
Server API: CGI/FastCGI

Upvotes: 3

Views: 1596

Answers (3)

PaulH

Reputation: 3049

There is one fopen() test and set mode: the x mode.

x Create and open for writing only; place the file pointer at the beginning of the file. If the file already exists, the fopen() call will fail by returning FALSE and generating an error of level E_WARNING. If the file does not exist, attempt to create it.

The fopen($filename ,'x') behaviour is the same as mkdir() and it can be used in the same way:

<?php
// lock
$fnLock = __FILE__ .'.lock'; // lock file filename
$lockLooping = 0; // counter can be used for tuning depending on lock duration
do {
    if ($lockHandle = @fopen($fnLock, 'x')) { // test and set command
        $lockLooping = 0;
    } else {
        $lockLooping += 1;
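        clearstatcache(); // stat results are cached within a request
        $lockTime = @filemtime($fnLock); // false if the lock was released meanwhile
        $lockAge = $lockTime === false ? 0 : time() - $lockTime;
        if ($lockAge > 10) {
            @unlink($fnLock); // robustness, in case a lock was not erased (it is a file, so unlink, not rmdir)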
        } else {
            // wait without consuming CPU before try again
            usleep(rand(2500, 25000)); // random to avoid parallel process conflict again
        }
    }
} while ($lockLooping > 0);

// do stuff under atomic protection
// don't take too long, because parallel processes are waiting for the unlock (unlink)

$content = file_get_contents($protected_file_name);  // example read
$modified_content = $content; // example modify (replace with the real processing)
file_put_contents($protected_file_name, $modified_content); // example write

// unlock
fclose($lockHandle);
unlink($fnLock);

It is a good idea to test this, e.g. using the code in the question. Many people rely on locking as documented, but surprises may appear during test or production under load (parallel requests from one browser may be enough).
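
As a minimal sketch of how the lock above could be packaged, the following wraps the fopen() 'x' lock in a function that releases the lock in a finally block, so an exception in the protected code does not leave the lock file behind. The function name withLockFile(), the timeout parameter and the demo file names are assumptions made for illustration, and the stale-lock cleanup from the loop above is left out to keep it short.

```php
<?php
// Hypothetical helper: runs $critical while holding a lock file created with fopen(..., 'x').
function withLockFile(string $lockPath, callable $critical, float $timeoutSeconds = 5.0)
{
    $deadline = microtime(true) + $timeoutSeconds;
    // 'x' mode fails if the file already exists, so creation doubles as test and set
    while (($handle = @fopen($lockPath, 'x')) === false) {
        if (microtime(true) >= $deadline) {
            throw new RuntimeException("Could not acquire lock: $lockPath");
        }
        usleep(random_int(2500, 25000)); // randomized wait before trying again
    }
    try {
        return $critical();
    } finally {
        // release the lock even if $critical() throws
        fclose($handle);
        unlink($lockPath);
    }
}

// Usage: increment a counter file under the lock (file names are made up for the demo).
$counterFile = sys_get_temp_dir() . '/demo_counter_' . getmypid() . '.txt';
$lockFile = $counterFile . '.lock';
file_put_contents($counterFile, '0');

$new = withLockFile($lockFile, function () use ($counterFile) {
    $value = (int) file_get_contents($counterFile) + 1; // read and modify
    file_put_contents($counterFile, (string) $value);   // write
    return $value;
});
echo $new, "\n";
```

Note that finally does not run when the script is killed or calls exit(), so a stale-lock check such as the one in the loop above is still needed in practice.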

Upvotes: 0

PaulH

Reputation: 3049

A way to do an atomic test-and-set instruction in PHP is to use mkdir(). It is a bit strange to use a directory for that instead of a file, but mkdir() will create a directory, or return false (with a suppressible warning) if it already exists. File commands like fwrite() and file_put_contents() do not test and set in one instruction, and fopen() does so only in x mode.

<?php
// lock
$fnLock = __FILE__ .'.lock'; // lock directory filename
$lockLooping = 0; // counter can be used for tuning depending on lock duration
do {
    if (@mkdir($fnLock, 0777)) { // mkdir is a test and set command
        $lockLooping = 0;
    } else {
        $lockLooping += 1;
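        clearstatcache(); // stat results are cached within a request
        $lockTime = @filemtime($fnLock); // false if the lock was released meanwhile
        $lockAge = $lockTime === false ? 0 : time() - $lockTime;
        if ($lockAge > 10) {
            @rmdir($fnLock); // robustness, in case a lock was not erased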
        } else {
            // wait without consuming CPU before try again
            usleep(rand(2500, 25000)); // random to avoid parallel process conflict again
        }
    }
} while ($lockLooping > 0);

// do stuff under atomic protection
// don't take too long, because parallel processes are waiting for the unlock (rmdir)

$content = file_get_contents($protected_file_name);  // example read
$modified_content = $content; // example modify (replace with the real processing)
file_put_contents($protected_file_name, $modified_content); // example write

// unlock
rmdir($fnLock);

Upvotes: 2

symcbean

Reputation: 48357

If you use files for data management coordinated only by PHP request handlers, you are heading for a world of pain. You've only just dipped your toes in the water thus far.

Using LOCK_EX, your writer needs to wait for any (and every) instance of LOCK_SH to be released before it will acquire the lock. Here you are setting flock() to block until the lock can be acquired. On a relatively busy system, the writer could be blocked indefinitely. Most operating systems have no priority queuing of locks that would place a subsequent reader requesting the lock behind a process already waiting for a write lock.
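
One way to avoid blocking indefinitely, sketched here under the assumption that flock() itself works on the host (which the question reports is not the case on one server), is to request the lock with LOCK_NB and give up after a deadline. The helper name lockWithTimeout(), the 10 ms polling interval and the demo file name are illustrative choices, not part of the original code.

```php
<?php
// Hypothetical helper: try to take an exclusive flock() without blocking forever.
function lockWithTimeout($handle, float $timeoutSeconds): bool
{
    $deadline = microtime(true) + $timeoutSeconds;
    do {
        // LOCK_NB makes flock() return false immediately instead of waiting
        if (flock($handle, LOCK_EX | LOCK_NB)) {
            return true;
        }
        usleep(10000); // wait 10 ms before polling again
    } while (microtime(true) < $deadline);
    return false;
}

// Usage on a scratch file (the file name is made up for the demo).
$path = sys_get_temp_dir() . '/demo_flock_' . getmypid() . '.txt';
$fh = fopen($path, 'c+');
$locked = lockWithTimeout($fh, 2.0);
if ($locked) {
    // ... read, modify, write ...
    flock($fh, LOCK_UN);
}
fclose($fh);
unlink($path);
```

When the function returns false, the caller can decide whether to retry, skip the update, or report an error, instead of hanging.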

A further complication is that you can only use flock() on an open file handle. This means that opening the file and acquiring the lock is not atomic. Furthermore, you need to flush the stat cache in order to determine the age of the file after acquiring the lock.
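
The stat cache point can be illustrated with a small sketch: PHP caches the result of filemtime() and related calls within a request, so a lock's age should be measured only after clearstatcache(). The helper name lockAge() and the demo path are assumptions for illustration.

```php
<?php
// Hypothetical helper: age in seconds of a lock file or directory, or null if it is absent.
function lockAge(string $lockPath): ?int
{
    clearstatcache(true, $lockPath); // drop PHP's cached stat result for this path
    $mtime = @filemtime($lockPath);  // false if the lock vanished in the meantime
    return $mtime === false ? null : time() - $mtime;
}

// Usage with a lock directory (the path is made up for the demo).
$lock = sys_get_temp_dir() . '/demo_lock_' . getmypid();
mkdir($lock);
$age = lockAge($lock);      // a small number of seconds just after creation
rmdir($lock);
$ageAfter = lockAge($lock); // null: the lock no longer exists
```

Returning null for a missing lock lets the caller distinguish "no lock" from "old lock", which matters when deciding whether to remove a stale one.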

Any writes to the file (even using file_put_contents()) are not atomic. So in the absence of exclusive locking you can't be sure that nobody will read a partial file.
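
A common way to keep readers from seeing a partial file, sketched below, is to write the new content to a temporary file in the same directory and then rename() it over the target. This assumes a POSIX-style file system where rename() within one file system replaces the target atomically; the helper name atomicWrite() and the demo file name are illustrative.

```php
<?php
// Hypothetical helper: replace $path so that readers never observe a half-written file.
function atomicWrite(string $path, string $content): void
{
    // Same directory as the target, so the rename stays on one file system
    $tmp = tempnam(dirname($path), 'tmp');
    if ($tmp === false || file_put_contents($tmp, $content) === false) {
        throw new RuntimeException("Could not write temporary file for $path");
    }
    if (!rename($tmp, $path)) {
        @unlink($tmp);
        throw new RuntimeException("Could not replace $path");
    }
}

// Usage (the file name is made up for the demo).
$file = sys_get_temp_dir() . '/demo_atomic_' . getmypid() . '.txt';
atomicWrite($file, 'first');
atomicWrite($file, 'second');
$result = file_get_contents($file);
unlink($file);
```

This only protects readers from partial content. Two processes doing read-modify-write at the same time can still overwrite each other's update, so a lock around the whole sequence remains necessary.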

In the absence of additional components (e.g. a daemon providing a lock queuing mechanism, or a caching reverse proxy in front of the web server, or a relational database) then your only option is to assume that you cannot ensure exclusive access and use atomic operations to semaphore the file, something like:

 $lock = dirname(CACHE_FILE) . "/lock";
 clearstatcache(); // stat results are cached within a request
 $lock_age = is_dir($lock) ? time() - filectime($lock) : 0;
 if (!is_file(CACHE_FILE) || filemtime(CACHE_FILE) < time() - CACHE_TTL) {
      // cache is missing or expired: regenerate it under the lock
      if ($lock_age > MAX_LOCK_TIME) {
           @rmdir($lock); // stale lock, probably left by a process that died
      }
      @mkdir($lock) || die("I give up"); // atomic: only one process can create it
      $content = generate_content(); // might want to add specific timing checks around this
      file_put_contents(CACHE_FILE, $content);
      rmdir($lock);
 } else if (is_dir($lock)) {
      // another process is regenerating: wait for it, then read
      $snooze = max(0, MAX_LOCK_TIME - $lock_age);
      sleep($snooze);
      $content = file_get_contents(CACHE_FILE);
 } else {
      $content = file_get_contents(CACHE_FILE);
 }

(note that this is a really ugly hack)

Upvotes: 0
