Reputation: 5326
I am benchmarking PHP file reading functions, just for my general knowledge. I tested three different ways to read the whole content of a file that I thought would be very fast: file_get_contents(), readfile() and exec('cat').
So here is my benchmarking code. Note that I enabled PHP's output buffering (ob_start()) for readfile() to avoid the direct output that would totally falsify the results.
<?php
/* Using a quick PNG file to benchmark with a big file */
/* file_get_contents() benchmark */
$start = microtime(true);
$foo = file_get_contents("bla.png");
$end = microtime(true) - $start;
echo "file_get_contents() time: " . $end . "s\n";
/* readfile() benchmark */
ob_start();
$start = microtime(true);
readfile('bla.png');
$end = microtime(true) - $start;
ob_end_clean();
echo "readfile() time: " . $end . "s\n";
/* exec('cat') benchmark */
$start = microtime(true);
$bar = exec('cat bla.png');
$end = microtime(true) - $start;
echo "exec('cat filename') time: " . $end . "s\n";
?>
I have run this code several times to confirm the results shown, and every time I got the same order. Here is an example of one run:
$ php test.php
file_get_contents() time: 0.0006861686706543s
readfile() time: 0.00085091590881348s
exec('cat filename') time: 0.0048539638519287s
As you can see, file_get_contents() comes first, then readfile(), and finally cat.
As for cat, even though it is a UNIX command (so fast and everything :)), I understand that calling a separate binary may cause the relatively high result.
But what I have some difficulty understanding is why file_get_contents() is faster than readfile(). In the run above, readfile() is about 1.24 times slower, after all.
Both functions are built-in and therefore pretty well optimized, and since I enabled output buffering, readfile() is not "trying" to output the data to stdout; just like file_get_contents(), it will put the data in RAM.
I am looking for a technical, low-level explanation here, to understand the pros and cons of file_get_contents() and readfile() besides the fact that one is designed to write directly to stdout whereas the other allocates memory in RAM.
Thanks in advance.
Upvotes: 5
Views: 10783
Reputation: 47
file_get_contents is generally the better fit than readfile when you want to cache a file's contents, because it returns the data as a string that you can keep in memory, whereas readfile writes the data straight to the output layer and only returns the number of bytes read.
Having the contents in a variable makes them easy to manipulate and reuse, which can give faster access than calling readfile again, since readfile reads the file one chunk at a time and sends each chunk to the output (normally the browser). Note that OPcache does not help here: it caches compiled PHP scripts, not data read with file_get_contents.
If in some cases you can't use file_get_contents, you can use PHP's output buffering to hold the contents of a file in memory before sending it to the client. Start an output buffer with ob_start before calling readfile, then send the buffer with ob_end_flush, or retrieve it as a string with ob_get_clean if you want to keep it. Either way, the contents of the file sit in the output buffer, in PHP's memory, until the buffer is flushed or cleaned.
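As a minimal sketch of the buffering approach described above: the helper name read_via_readfile and the temporary demo file are assumptions made for illustration, not something from the question. It captures what readfile would print and returns it as a string, then checks that the result matches file_get_contents.

```php
<?php
// Hypothetical helper: capture what readfile() would print and return it as a string.
function read_via_readfile(string $path): string
{
    ob_start();                    // start buffering output in memory
    $bytes = readfile($path);      // writes the file into the buffer; returns bytes read or false
    $contents = ob_get_clean();    // fetch the buffer contents and discard the buffer
    if ($bytes === false) {
        throw new RuntimeException("Could not read $path");
    }
    return $contents;
}

// Demo with a temporary file so the sketch is self-contained.
$tmp = tempnam(sys_get_temp_dir(), 'rf_');
file_put_contents($tmp, "hello\nworld\n");
$viaReadfile = read_via_readfile($tmp);
$viaFgc = file_get_contents($tmp);
unlink($tmp);

var_dump($viaReadfile === $viaFgc); // bool(true)
```

Both calls end up with the same bytes in memory; the buffered readfile variant simply takes a detour through the output layer to get there.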
Upvotes: 1
Reputation: 26739
file_get_contents only loads the data from the file into memory, while both readfile and cat also output the data to the screen, so they just perform more operations.
If you want to compare file_get_contents to the others, add echo before it.
Also, you are not freeing the memory allocated for $foo. There is a chance that if you move the file_get_contents test to the end, you will get a different result.
Additionally, you are using output buffering for only one of the tests, which also causes some difference. Try running the rest of the functions inside an output buffer as well to remove any differences.
When comparing different functions, the rest of the code should be the same, otherwise you are open to all kinds of influences.
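The points above could be combined into a small harness like the following. This is only a sketch under stated assumptions: the bench function, the 1 MiB temporary file standing in for bla.png, and the run count are all made up for illustration, and hrtime() requires PHP 7.3 or later. Every candidate runs inside the same kind of output buffer, file_get_contents is echoed so it does the same work as readfile, and nothing is kept in a variable between runs.

```php
<?php
// Hypothetical harness: time a callable under identical conditions.
function bench(callable $fn, int $runs = 50): float
{
    $total = 0;
    for ($i = 0; $i < $runs; $i++) {
        ob_start();                 // every candidate runs inside its own output buffer
        $start = hrtime(true);      // monotonic clock, in nanoseconds
        $fn();
        $total += hrtime(true) - $start;
        ob_end_clean();             // discard the output and free the buffer before the next run
    }
    return $total / $runs / 1e9;    // mean seconds per run
}

// Stand-in for bla.png: a 1 MiB temporary file (an assumption for a self-contained demo).
$file = tempnam(sys_get_temp_dir(), 'bench_');
file_put_contents($file, random_bytes(1024 * 1024));

$candidates = [
    'echo file_get_contents()' => function () use ($file) { echo file_get_contents($file); },
    'readfile()'               => function () use ($file) { readfile($file); },
];

$results = [];
foreach ($candidates as $label => $fn) {
    $results[$label] = bench($fn);
}
unlink($file);

foreach ($results as $label => $seconds) {
    printf("%-26s %.6fs\n", $label, $seconds);
}
```

Averaging over many runs and reordering the entries in $candidates are cheap ways to check whether the ranking is stable or just an artifact of test order and memory state.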
Upvotes: 7