Anand Shah

Reputation: 14913

Create .tar.gz file using PHP

The project I am working on requires creating .tar.gz archives and feeding them to an external service. This external service works only with .tar.gz, so another archive type is out of the question. The server where my code will execute does not allow access to system calls, so system, exec, backticks etc. are no bueno. This means I have to rely on a pure PHP implementation to create .tar.gz files.

Having done a bit of research, it seems that PharData will be helpful to achieve the result. However I have hit a wall with it and need some guidance.

Consider the following folder layout:

parent folder
  - child folder 1
  - child folder 2
  - file1
  - file2

I am using the code snippet below to create the .tar.gz archive. It does the trick, but there's a minor issue with the end result: it doesn't contain the parent folder, only everything within it.

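$pd = new PharData('archive.tar');
$dir = realpath("parent-folder");
$pd->buildFromDirectory($dir);   // adds the contents of $dir, not $dir itself
$pd->compress(Phar::GZ);         // writes archive.tar.gz
unset( $pd );
unlink('archive.tar');           // remove the intermediate .tar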

When the archive is created it must contain the exact folder layout mentioned above. With the snippet above, the archive contains everything except the parent folder, which is a deal breaker for the external service:

- child folder 1
- child folder 2
- file1
- file2

The description of buildFromDirectory does mention the following, so it is understandable that the archive does not contain the parent folder:

Construct a tar/zip archive from the files within a directory.

I have also tried using buildFromIterator, but the end result is the same, i.e. the parent folder isn't included in the archive. I was able to get the desired result using addFile, but this is painfully slow.

Having done a bit more research I found the following library: https://github.com/alchemy-fr/Zippy. But this requires Composer support, which isn't available on the server. I'd appreciate it if someone could guide me in achieving the end result. I am also open to using some other method or library so long as it's a pure PHP implementation and doesn't require any external dependencies. Not sure if it helps, but the server where the code will get executed has PHP 5.6.

Upvotes: 6

Views: 4695

Answers (2)

xrobau

Reputation: 1135

I ended up having to do this, and as this question is the first result on Google for the problem, here's the optimal way to do it without using a regexp (which does not scale well if you want to extract one directory from a directory that contains many others).

function buildFiles($folder, $dir, $retarr = []) {
    $i = new DirectoryIterator("$folder/$dir");
    foreach ($i as $d) {
        if ($d->isDot()) {
            continue;
        }
        if ($d->isDir()) {
            $newdir = "$dir/" . basename($d->getPathname());
            $retarr = buildFiles($folder, $newdir, $retarr);
        } else {
            $dest = "$dir/" . $d->getFilename();
            $retarr[$dest] = $d->getPathname();
        }
    }
    return $retarr;
}
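
$out = "/tmp/file.tar";
$sourcedir = "/data/folder";
$subfolder = "folder2";
$p = new PharData($out);
// Map of "path inside the archive" => "path on disk",
// e.g. folder2/x => /data/folder/folder2/x
$filemap = buildFiles($sourcedir, $subfolder);
$iterator = new ArrayIterator($filemap);
$p->buildFromIterator($iterator);
$p->compress(\Phar::GZ);
unset($p); // release the archive object before deleting the file
unlink($out); // $out.gz has been created, remove the original .tar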

This allows you to pick /data/folder/folder2 from /data/folder, even if /data/folder contains several million other folders. It then creates a tar.gz in which every entry is prefixed with the folder name (folder2/...).
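
To make the key/value shape of that map concrete, here is a minimal self-contained sketch. The temp directory, file names and the $hasPrefixed / $hasBare variables are made up for illustration, and the map is written by hand instead of being produced by buildFiles. It only assumes that buildFromIterator treats each key as the path inside the archive and each value as the file on disk, which is what the code above relies on as well.

```php
<?php
// Hypothetical demo layout; none of these names come from the question.
$id  = uniqid();
$src = sys_get_temp_dir() . "/phar_iter_demo_$id";
mkdir("$src/folder2/sub", 0777, true);
file_put_contents("$src/folder2/a.txt", "a");
file_put_contents("$src/folder2/sub/b.txt", "b");

// Key   = path the entry gets inside the archive (note the folder2/ prefix).
// Value = path of the real file on disk.
$filemap = [
    'folder2/a.txt'     => "$src/folder2/a.txt",
    'folder2/sub/b.txt' => "$src/folder2/sub/b.txt",
];

$out = sys_get_temp_dir() . "/phar_iter_demo_$id.tar";
$p = new PharData($out);
$p->buildFromIterator(new ArrayIterator($filemap));
$p->compress(Phar::GZ);   // writes "$out.gz"
unset($p);
unlink($out);             // drop the intermediate .tar

// Re-open the compressed archive and look entries up by their archive path.
$check       = new PharData("$out.gz");
$hasPrefixed = isset($check['folder2/a.txt']); // entry stored under folder2/
$hasBare     = isset($check['a.txt']);         // no entry at the archive root
var_dump($hasPrefixed, $hasBare);
```

Because the keys decide where entries land, the folder prefix can be anything you like, which is how the parent folder ends up inside the archive.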

Upvotes: 3

user3942918

Reputation: 26375

Use the parent of "parent-folder" as the base for Phar::buildFromDirectory() and use its second parameter to limit the results only to "parent-folder", e.g.:

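$pd = new PharData('archive.tar');
$parent = dirname("parent-folder");
// Entry names are relative to $parent; the regex keeps only paths under parent-folder/
$pd->buildFromDirectory($parent, '#^'.preg_quote("$parent/parent-folder/", "#").'#');
$pd->compress(Phar::GZ);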
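
For anyone who wants to try this in isolation, below is a rough end-to-end sketch of the same idea. The temp layout, the sibling "other-folder" and the $hasParent / $hasSibling / $hasBare variables are hypothetical, and it assumes Unix-style paths with forward slashes. The archive is written outside the directory being scanned.

```php
<?php
// Hypothetical demo layout in a temp directory (all names are made up).
$id   = uniqid();
$base = sys_get_temp_dir() . "/phar_regex_demo_$id";
mkdir("$base/parent-folder/child folder 1", 0777, true);
mkdir("$base/other-folder", 0777, true);
file_put_contents("$base/parent-folder/file1", "one");
file_put_contents("$base/parent-folder/child folder 1/nested", "two");
file_put_contents("$base/other-folder/skip-me", "x");
$base = realpath($base);

$tar = sys_get_temp_dir() . "/phar_regex_demo_$id.tar";
$pd  = new PharData($tar);

// Scan the parent of parent-folder, but keep only paths below parent-folder/.
$regex = '#^' . preg_quote("$base/parent-folder/", '#') . '#';
$pd->buildFromDirectory($base, $regex);
$pd->compress(Phar::GZ);  // writes "$tar.gz"
unset($pd);
unlink($tar);             // drop the intermediate .tar

// Re-open the result and check where the entries ended up.
$gz         = new PharData("$tar.gz");
$hasParent  = isset($gz['parent-folder/file1']);   // stored with the folder prefix
$hasSibling = isset($gz['other-folder/skip-me']);  // filtered out by the regex
$hasBare    = isset($gz['file1']);                 // nothing at the archive root
var_dump($hasParent, $hasSibling, $hasBare);
```

Note that the directory iterator still walks every sibling of parent-folder before the regex discards them, so this is simplest when the parent directory is small.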

Upvotes: 4
