Reputation: 659
I need to get the file size of a file over 2 GB in size. (testing on 4.6 GB file). Is there any way to do this without an external program?
Current status:
filesize(), stat() and fseek() fail
fread() and feof() work
There is a possibility to get the file size by reading the file content (extremely slow!).
$size = (float) 0;
$chunksize = 1024 * 1024;
while (!feof($fp)) {
    // add the bytes actually read; the final chunk is usually shorter than $chunksize
    $size += (float) strlen(fread($fp, $chunksize));
}
return $size;
I know how to get it on 64-bit platforms (using fseek($fp, 0, SEEK_END) and ftell()), but I need a solution for a 32-bit platform.
Solution: I've started an open-source project for this.
Big File Tools is a collection of hacks that are needed to manipulate files over 2 GB in PHP (even on 32-bit systems).
Upvotes: 26
Views: 20296
Reputation: 9916
Here's one possible method:
It first attempts to use a platform-appropriate shell command (Windows shell substitution modifiers or the *nix/Mac stat command). If that fails, it tries COM (if on Windows), and finally falls back to filesize().
/*
* This software may be modified and distributed under the terms
* of the MIT license.
*/
function filesize64($file)
{
    static $iswin;
    if (!isset($iswin)) {
        $iswin = (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN');
    }

    static $exec_works;
    if (!isset($exec_works)) {
        $exec_works = (function_exists('exec') && !ini_get('safe_mode') && @exec('echo EXEC') == 'EXEC');
    }

    // try a shell command
    if ($exec_works) {
        $arg = escapeshellarg($file); // quote the path so the shell cannot interpret it
        if ($iswin) {
            $cmd = "for %F in ($arg) do @echo %~zF";
        } elseif (PHP_OS == 'Darwin') {
            $cmd = "stat -f%z $arg"; // BSD stat (Mac)
        } else {
            $cmd = "stat -c%s $arg"; // GNU stat
        }
        $output = array();
        @exec($cmd, $output);
        if (is_array($output) && ctype_digit($size = trim(implode("\n", $output)))) {
            return $size;
        }
    }

    // try the Windows COM interface
    if ($iswin && class_exists("COM")) {
        try {
            $fsobj = new COM('Scripting.FileSystemObject');
            $f = $fsobj->GetFile( realpath($file) );
            $size = (string) $f->Size; // ctype_digit() expects a string
        } catch (Exception $e) {
            $size = null;
        }
        if (is_string($size) && ctype_digit($size)) {
            return $size;
        }
    }

    // if all else fails
    return filesize($file);
}
Upvotes: 23
Reputation: 659
I've started a project called Big File Tools. It is proven to work on Linux, Mac and Windows (even 32-bit variants). It provides byte-precise results even for huge files (>4GB). Internally it uses brick/math, an arbitrary-precision arithmetic library.
Install it using Composer:
composer require jkuchar/bigfiletools
and use it:
<?php
$file = BigFileTools\BigFileTools::createDefault()->getFile(__FILE__);
echo $file->getSize() . " bytes\n";
The result is a BigInteger, so you can do further arithmetic with it:
$sizeInBytes = $file->getSize();
$sizeInMegabytes = $sizeInBytes->toBigDecimal()->dividedBy(1024*1024, 2, \Brick\Math\RoundingMode::HALF_DOWN);
echo "Size is $sizeInMegabytes megabytes\n";
Big File Tools internally uses drivers to reliably determine the exact file size on all platforms. Here is the list of available drivers (updated 2016-02-05):
| Driver           | Time (s) ↓          | Runtime requirements | Platform             |
| ---------------- | ------------------- | -------------------- | -------------------- |
| CurlDriver       | 0.00045299530029297 | CURL extension       | -                    |
| NativeSeekDriver | 0.00052094459533691 | -                    | -                    |
| ComDriver        | 0.0031449794769287  | COM+.NET extension   | Windows only         |
| ExecDriver       | 0.042937040328979   | exec() enabled       | Windows, Linux, OS X |
| NativeRead       | 2.7670161724091     | -                    | -                    |
You can use BigFileTools with any of these drivers; by default (BigFileTools::createDefault()) the fastest available one is chosen.
use BigFileTools\BigFileTools;
use BigFileTools\Driver;
$bigFileTools = new BigFileTools(new Driver\CurlDriver());
Upvotes: 8
Reputation: 37
Well, the easiest way to do that would be to simply add a max value to your number. This means that on an x86 platform, for a long number, you add 2^32:
if($size < 0) $size = pow(2,32) + $size;
Example: Big_File.exe - 3,30 GB (3.554.287.616 B). Your function returns -740679680, so you add 2^32 (4294967296) and get 3554287616.
You get a negative number because your system reserves one bit for the sign, so you are left with 2^31 - 1 (2.147.483.647, just under 2G) as the maximum positive value. When the value goes past this maximum it doesn't stop but simply overwrites that reserved sign bit, and your number is now forced to negative. In simpler words, when you exceed the maximum positive number you wrap around to the most negative number, so 2147483647 + 1 = -2147483648. Further addition goes towards zero and again towards the maximum number.
As you can see, it is like a circle with the highest and lowest numbers closing the loop.
The total number of values that the x86 architecture can "digest" in one tick is 2^32 = 4294967296 = 4G, so as long as your number is lower than that, this simple trick will always work. For higher numbers you must know how many times you have passed the looping point, multiply that count by 2^32 and add it to your result:
$size = pow(2,32) * $loops_count + $size;
Of course, with the basic PHP functions this is quite hard to do, because no function will tell you how many times it has passed the looping point, so this won't work for files over 4 GB.
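As a minimal sketch of the single-wrap case described above, the correction can be wrapped in a small helper. The function name unwrapSigned32 is made up for illustration, and the result is only meaningful when the real size is below 2^32 bytes:

```php
<?php
// Hypothetical helper (not from the answer above): undo one signed 32-bit wraparound.
// Only valid when the real size is below 2^32 bytes (4 GiB).
function unwrapSigned32($size)
{
    // 4294967296.0 is 2^32 written as a float, so the sum cannot overflow on 32-bit PHP
    return $size < 0 ? $size + 4294967296.0 : (float) $size;
}

// The value from the Big_File.exe example
printf("%.0f\n", unwrapSigned32(-740679680)); // prints 3554287616
```

The helper returns a float on purpose, since the corrected value no longer fits a 32-bit integer.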
Upvotes: 1
Reputation: 2655
$file_size=sprintf("%u",filesize($working_dir."\\".$file));
This works for me on a Windows Box.
I was looking through the bug log here: https://bugs.php.net/bug.php?id=63618 and found this solution.
Upvotes: 4
Reputation: 4875
I found a nice slim solution, for Linux/Unix only, to get the file size of large files with 32-bit PHP.
$file = "/path/to/my/file.tar.gz";
$filesize = exec("stat -c %s " . escapeshellarg($file));
You should handle $filesize as a string. Trying to cast it to int results in PHP_INT_MAX if the file size is larger than PHP_INT_MAX.
Although the size is handled as a string, the following human-readable formatting function works:
formatBytes($filesize);
function formatBytes($size, $precision = 2) {
    $base = log($size) / log(1024);
    $suffixes = array('', 'k', 'M', 'G', 'T');
    return round(pow(1024, $base - floor($base)), $precision) . $suffixes[(int) floor($base)];
}
so my output for a file larger than 4 GB is:
4.46G
Upvotes: 2
Reputation: 370
The code below works for any file size, independent of OS, web server or platform, as long as the file is reachable over HTTP and allow_url_fopen is enabled.
// http head request to local file to get file size
$opts = array('http'=>array('method'=>'HEAD'));
$context = stream_context_create($opts);
// change the URL below to the URL of your file. DO NOT change it to a file path.
// you MUST use a http:// URL for your file for a http request to work
// SECURITY - you must add a .htaccess rule which denies all requests for this database file except those coming from local ip 127.0.0.1.
// $tmp will contain 0 bytes, since it's a HEAD request only, so no data is actually downloaded; we only want the file size
$tmp = file_get_contents('http://127.0.0.1/pages-articles.xml.bz2', false, $context);
$tmp = $http_response_header;
foreach ($tmp as $rcd) {
    if (stripos(trim($rcd), "Content-Length:") === 0) {
        $size = floatval(trim(str_ireplace("Content-Length:", "", $rcd)));
    }
}
echo "File size = $size bytes";
// example output:
// File size = 10082006833 bytes
Upvotes: 0
Reputation: 1319
I wrote a function which returns the file size exactly and is quite fast:
function file_get_size($file) {
    //open file
    $fh = fopen($file, "r");
    //declare some variables
    $size = "0";
    $char = "";
    //set file pointer to 0; I'm a little bit paranoid, you can remove this
    fseek($fh, 0, SEEK_SET);
    //set multiplicator to zero
    $count = 0;
    while (true) {
        //jump 1 MB forward in file
        fseek($fh, 1048576, SEEK_CUR);
        //check if we actually left the file
        if (($char = fgetc($fh)) !== false) {
            //if not, go on
            $count++;
        } else {
            //else jump back where we were before leaving and exit loop
            fseek($fh, -1048576, SEEK_CUR);
            break;
        }
    }
    //we could make $count jumps, so the file is at least $count * 1.000001 MB large
    //1048577 because we jump 1 MB and fgetc goes 1 B forward too
    $size = bcmul("1048577", $count);
    //now count the last few bytes; they're always less than 1048576 so it's quite fast
    $fine = 0;
    while (false !== ($char = fgetc($fh))) {
        $fine++;
    }
    //and add them
    $size = bcadd($size, $fine);
    fclose($fh);
    return $size;
}
Upvotes: -1
Reputation: 430
You can't reliably get the size of a file on a 32 bit system by checking if filesize() returns negative, as some answers suggest. This is because if a file is between 4 and 6 gigs on a 32 bit system filesize will report a positive number, then negative from 6 to 8 then positive from 8 to 10 and so on. It loops, in a manner of speaking.
So you're stuck using an external command that works reliably on your 32 bit system.
However, one very useful tool is the ability to check if the file size is bigger than a certain size and you can do this reliably on even very big files.
The following seeks to 50 megs and tries to read one byte. It is very fast on my low spec test machine and works reliably even when the size is much greater than 2 gigs.
You can use this to check if a file is greater than 2147483647 bytes (the max int on 32-bit systems) and then handle the file differently or have your app issue a warning.
function isTooBig($file){
    $fh = @fopen($file, 'r');
    if(! $fh){ return false; }
    $offset = 50 * 1024 * 1024; //50 megs
    $tooBig = false;
    if(fseek($fh, $offset, SEEK_SET) === 0){
        if(strlen(fread($fh, 1)) === 1){
            $tooBig = true;
        }
    } //Otherwise we couldn't seek there so it must be smaller
    fclose($fh);
    return $tooBig;
}
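The same seek-and-read-one-byte check can be sketched with the threshold as a parameter. The name isLargerThan and the temporary-file demo are assumptions for illustration, not part of the answer:

```php
<?php
// Hypothetical generalisation of the check above: is $file larger than $limit bytes?
function isLargerThan($file, $limit)
{
    $fh = @fopen($file, 'rb');
    if (!$fh) {
        return false;
    }
    $larger = false;
    // Seeking to byte offset $limit and reading one byte succeeds
    // only if the file has at least $limit + 1 bytes.
    if (fseek($fh, $limit, SEEK_SET) === 0) {
        $larger = (strlen((string) fread($fh, 1)) === 1);
    }
    fclose($fh);
    return $larger;
}

// Demo on a small temporary file of exactly 10 bytes
$tmp = tempnam(sys_get_temp_dir(), 'big');
file_put_contents($tmp, str_repeat('x', 10));
var_dump(isLargerThan($tmp, 9));  // bool(true): a byte exists at offset 9
var_dump(isLargerThan($tmp, 10)); // bool(false): nothing to read at offset 10
unlink($tmp);
```

On a 32-bit system, passing PHP_INT_MAX as $limit would test for files above 2147483647 bytes, assuming the platform allows seeking that far.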
Upvotes: 0
Reputation: 49
<?php
######################################################################
# Human size for files smaller or bigger than 2 GB on 32 bit Systems #
# size.php - 1.1 - 17.01.2012 - Alessandro Marinuzzi - www.alecos.it #
######################################################################
function showsize($file) {
    if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') {
        if (class_exists("COM")) {
            $fsobj = new COM('Scripting.FileSystemObject');
            $f = $fsobj->GetFile(realpath($file));
            $file = $f->Size;
        } else {
            // quote the path, as the stat branches below already do
            $file = trim(exec("for %F in (" . escapeshellarg($file) . ") do @echo %~zF"));
        }
    } elseif (PHP_OS == 'Darwin') {
        $file = trim(shell_exec("stat -f %z " . escapeshellarg($file)));
    } elseif ((PHP_OS == 'Linux') || (PHP_OS == 'FreeBSD') || (PHP_OS == 'Unix') || (PHP_OS == 'SunOS')) {
        $file = trim(shell_exec("stat -c%s " . escapeshellarg($file)));
    } else {
        $file = filesize($file);
    }
    if ($file < 1024) {
        echo $file . ' Byte';
    } elseif ($file < 1048576) {
        echo round($file / 1024, 2) . ' KB';
    } elseif ($file < 1073741824) {
        echo round($file / 1048576, 2) . ' MB';
    } elseif ($file < 1099511627776) {
        echo round($file / 1073741824, 2) . ' GB';
    } elseif ($file < 1125899906842624) {
        echo round($file / 1099511627776, 2) . ' TB';
    } elseif ($file < 1152921504606846976) {
        echo round($file / 1125899906842624, 2) . ' PB';
    } elseif ($file < 1180591620717411303424) {
        echo round($file / 1152921504606846976, 2) . ' EB';
    } elseif ($file < 1208925819614629174706176) {
        echo round($file / 1180591620717411303424, 2) . ' ZB';
    } else {
        echo round($file / 1208925819614629174706176, 2) . ' YB';
    }
}
?>
Use as follows:
<?php include("php/size.php"); ?>
And where you want:
<?php showsize("files/VeryBigFile.rar"); ?>
If you want to improve it, you are welcome!
Upvotes: 4
Reputation: 9
When IEEE 754 doubles are used (as on the vast majority of systems), file sizes up to 2^53 bytes (about 9 PB; 1 petabyte = 10^15 bytes) fit into a double exactly, and standard arithmetic on such whole numbers loses no precision as long as the results stay within that range.
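A short sketch of where exact whole-number storage in a double ends. It assumes PHP floats are IEEE 754 doubles, which is the usual case, and the 4.6 GB value is just an illustrative size:

```php
<?php
// 2^53 is the point up to which every whole number is exactly representable as a double
$limit = 2.0 ** 53; // 9007199254740992.0

var_dump(($limit - 1.0) + 1.0 === $limit); // bool(true): still exact below the limit
var_dump($limit + 1.0 === $limit);         // bool(true): 2^53 + 1 is rounded back to 2^53

// A 4.6 GB size (hypothetical value) stored as a float keeps every byte
$size = 4600000000.0;
printf("%.0f\n", $size + 1.0); // prints 4600000001
```

So a float is a safe container for byte counts on 32-bit PHP as long as the size stays far below 2^53, which covers any realistic file.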
Upvotes: 0
Reputation: 165201
One option would be to seek to the 2 GB mark and then read the length from there...
function getTrueFileSize($filename) {
    $size = filesize($filename);
    if ($size === false) {
        $fp = fopen($filename, 'r');
        if (!$fp) {
            return false;
        }
        $offset = PHP_INT_MAX - 1;
        $size = (float) $offset;
        // fseek() returns 0 on success and -1 on failure
        if (fseek($fp, $offset) !== 0) {
            fclose($fp);
            return false;
        }
        $chunksize = 8192;
        while (!feof($fp)) {
            $size += strlen(fread($fp, $chunksize));
        }
        fclose($fp);
    } elseif ($size < 0) {
        // Handle overflowed integer...
        $size = sprintf("%u", $size);
    }
    return $size;
}
So basically that seeks to just below the largest positive signed integer representable in PHP (2 GB for a 32-bit system), and then reads from there on using 8 KB blocks (which should be a fair tradeoff between memory efficiency and disk transfer efficiency).
Also note that I'm not adding $chunksize to $size. The reason is that fread may actually return fewer bytes than $chunksize, depending on a number of possibilities. So instead, use strlen to determine the length of the returned string.
Upvotes: 0
Reputation: 2691
You may want to add some alternatives to the function you use, such as calling system commands like "dir" / "ls" and getting the information from there. These are subject to security restrictions, of course, which you can check for, reverting to the slow method only as a last resort.
Upvotes: 0
Reputation: 15425
If you have an FTP server you could use fsockopen:
$socket = fsockopen($hostName, 21);
$t = fgets($socket, 128);
fwrite($socket, "USER $myLogin\r\n");
$t = fgets($socket, 128);
fwrite($socket, "PASS $myPass\r\n");
$t = fgets($socket, 128);
fwrite($socket, "SIZE $fileName\r\n");
$t = fgets($socket, 128);
$fileSize=floatval(str_replace("213 ","",$t));
echo $fileSize;
fwrite($socket, "QUIT\r\n");
fclose($socket);
(Found as a comment on the ftp_size page)
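One caveat with the snippet above: floatval() after removing "213 " would turn an error reply such as "550 No such file" into 550. A minimal sketch of a stricter reply parser follows; the function name parseFtpSizeReply is hypothetical and it assumes the server answers SIZE with a single "213 <bytes>" line:

```php
<?php
// Hypothetical helper: extract the size from one FTP reply line to a SIZE command.
// Returns a float (safe on 32-bit PHP) or null if the line is not a "213" success reply.
function parseFtpSizeReply($line)
{
    if (preg_match('/^213 (\d+)\s*$/', $line, $m)) {
        return (float) $m[1];
    }
    return null;
}

var_dump(parseFtpSizeReply("213 4600000000\r\n"));   // the size as a float
var_dump(parseFtpSizeReply("550 No such file\r\n")); // NULL
```

In the snippet above it would be applied to the $t read after the SIZE command.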
Upvotes: 0