Reputation: 48899
I have to parse a lot (10,000+) of remote gzipped files. Each gzipped file should contain a CSV (possibly inside a folder). Right now I can fetch the body, check its content type and uncompress it, which leaves me with application/octet-stream.
My question is: what is this octet-stream, and how can I check for files or folders inside it?
/** @var $guzzle \Guzzle\Http\Client */
$guzzle = $this->getContainer()->get('guzzle');
$request = $guzzle->get($url);
try {
    $body = $request->send()->getBody();
    // Check the body content type before trying to uncompress it
    if ('application/x-gzip' === $body->getContentType()) {
        $body->uncompress();
        $body->getContentType(); // application/octet-stream
    } else {
        // Log and skip current remote file
    }
} catch (\Exception $e) {
    $output->writeln("Failed: {$guzzle->getBaseUrl()}");
    throw $e;
}
Upvotes: 0
Views: 6237
Reputation: 5266
The EntityBody object that stores the body can only guess the content type of local files. Use the Content-Type header of the response to get a more accurate value.
Something like this:
$response = $request->send();
$type = $response->getContentType();
Upvotes: 1
Reputation: 597
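You should be able to use the built-in gzuncompress function.
See http://php.net/manual/en/function.gzuncompress.php
Edit: Or one of the other zlib functions, depending on what data you are working with. gzuncompress expects zlib-format data (as written by gzcompress), while gzip-format data such as a .gz file is decoded with gzdecode. http://php.net/manual/en/ref.zlib.php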
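To make the choice of zlib function concrete, here is a minimal sketch, assuming the downloaded body is already available as a string and that gzdecode is available (PHP 5.4+). A .gz file holds a single compressed stream rather than a directory tree, so "files or folders" can only exist if the decompressed payload is itself a tar archive. The helper names looksLikeTar and parseCsv and the sample data are made up for illustration.

```php
<?php
// Hypothetical helper: true if the decompressed bytes look like a tar archive.
// POSIX (ustar) tar headers carry the magic string "ustar" at byte offset 257.
function looksLikeTar($data)
{
    return strlen($data) >= 262 && substr($data, 257, 5) === 'ustar';
}

// Hypothetical helper: split CSV text into rows of fields.
// Simplification: assumes no quoted field contains a line break.
function parseCsv($csv)
{
    $rows = array();
    foreach (preg_split('/\r\n|\n|\r/', trim($csv)) as $line) {
        $rows[] = str_getcsv($line, ',', '"', '\\');
    }
    return $rows;
}

// Stand-in for the downloaded body: gzencode() produces gzip-format data,
// the same format as a .gz file.
$gzipped = gzencode("id,name\n1,foo\n2,bar\n");

// gzdecode() reads the gzip format; gzuncompress() expects zlib-format data
// and would fail on this input.
$data = gzdecode($gzipped);
if ($data === false) {
    throw new RuntimeException('Not valid gzip data');
}

if (looksLikeTar($data)) {
    // A .tar.gz: the entries would need a tar-aware reader, not shown here.
    $rows = array();
} else {
    $rows = parseCsv($data);
}
```

With the sample data, $rows ends up holding the header row followed by two data rows. For real files the tar check decides whether the payload is a plain CSV or an archive that needs a further extraction step.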
Upvotes: 0
Reputation: 105
A shell command like this should work for you:
shell_exec('gzip -d your_file.gz');
You can first decompress all your files into a particular directory and then read each file or do whatever processing you need.
As a side note:
Take care where the command is run from, since gzip -d writes the decompressed file next to the original (or use gzip -c and redirect the output to the directory you want). You might want to take a look at escapeshellarg too ;-)
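The side note can be sketched roughly as follows, assuming a Unix-like shell with gzip on the PATH. The helper name buildGunzipCommand and both paths are hypothetical, and the shell_exec call is left commented out so the snippet runs without touching any files.

```php
<?php
// Hypothetical helper: build a command that decompresses $source into $targetDir.
// "gzip -dc" writes the decompressed data to stdout, so the shell redirect
// decides where the result lands, independent of the current directory.
function buildGunzipCommand($source, $targetDir)
{
    // Drop the .gz suffix to get the name of the decompressed file.
    $target = rtrim($targetDir, '/') . '/' . basename($source, '.gz');

    // escapeshellarg() quotes each path so spaces or shell metacharacters
    // in a file name cannot change the command.
    return sprintf(
        'gzip -dc %s > %s',
        escapeshellarg($source),
        escapeshellarg($target)
    );
}

// Example paths, made up for illustration.
$command = buildGunzipCommand('/tmp/downloads/your_file.gz', '/tmp/extracted');

// shell_exec($command); // enable once the paths point at real files
```

Building the command in one place keeps the quoting consistent when looping over thousands of file names.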
Upvotes: 0