binaryguy

Reputation: 1187

Extract tgz file in memory and access files in C#

I have a service that downloads a *.tgz file from a remote endpoint. I use SharpZipLib to extract the contents of that compressed archive and write them to disk. Now I want to avoid writing the files to disk (the process doesn't have write permission there) and keep them in memory instead.

How can I access the decompressed files from memory? (Let's assume the archive holds simple text files)

Here is what I have so far:

    public void Decompress(byte[] byteArray)
    {
        Stream inStream = new MemoryStream(byteArray);
        Stream gzipStream = new GZipInputStream(inStream);

        TarArchive tarArchive = TarArchive.CreateInputTarArchive(gzipStream);
        tarArchive.ExtractContents(@".");
        tarArchive.Close();

        gzipStream.Close();
        inStream.Close();
    }

Upvotes: 2

Views: 3862

Answers (1)

pneuma

Reputation: 977

Check this and this out.

Turns out, ExtractContents() works by iterating over TarInputStream. When you create your TarArchive like this:

TarArchive.CreateInputTarArchive(gzipStream);

it actually wraps the stream you're passing into a TarInputStream. Thus, if you want more fine-grained control over how you extract files, you must use TarInputStream directly.

See if you can iterate over the files, directories, and actual file contents like this:

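// Dispose all three streams deterministically, innermost last.
using (var inStream = new MemoryStream(byteArray))
using (var gzipStream = new GZipInputStream(inStream))
using (var tarInputStream = new TarInputStream(gzipStream))
{
    TarEntry entry;
    while ((entry = tarInputStream.GetNextEntry()) != null)
    {
        // entry.IsDirectory distinguishes directory entries from files
        var fileName = entry.Name;
        using (var fileContents = new MemoryStream())
        {
            // Copy the current entry's bytes into memory instead of onto disk
            tarInputStream.CopyEntryContents(fileContents);
            fileContents.Position = 0; // rewind before reading from it

            // use entry, fileName or fileContents here
        }
    }
}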
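
Since the question assumes the archive holds simple text files, the loop above could be wrapped in a helper that returns every file as a string. This is only a minimal sketch: the class name TgzReader and the method name ReadTextEntries are made up for illustration, and it assumes the files are UTF-8 encoded and small enough to hold in memory.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using ICSharpCode.SharpZipLib.GZip;
using ICSharpCode.SharpZipLib.Tar;

public static class TgzReader
{
    // Hypothetical helper (not part of SharpZipLib): maps each file's
    // entry name to its decoded text content.
    public static Dictionary<string, string> ReadTextEntries(byte[] tgzBytes)
    {
        var files = new Dictionary<string, string>();

        using (var inStream = new MemoryStream(tgzBytes))
        using (var gzipStream = new GZipInputStream(inStream))
        // Same single-argument constructor as in the snippet above; newer
        // SharpZipLib releases may flag it as obsolete in favor of an
        // overload that also takes the entry-name encoding.
        using (var tarInputStream = new TarInputStream(gzipStream))
        {
            TarEntry entry;
            while ((entry = tarInputStream.GetNextEntry()) != null)
            {
                // Directory entries have no content to decode.
                if (entry.IsDirectory)
                    continue;

                using (var fileContents = new MemoryStream())
                {
                    tarInputStream.CopyEntryContents(fileContents);

                    // Assumption: the text files are UTF-8 encoded.
                    files[entry.Name] = Encoding.UTF8.GetString(fileContents.ToArray());
                }
            }
        }

        return files;
    }
}
```

You would then call it as `var files = TgzReader.ReadTextEntries(byteArray);` and look up a file by its path inside the archive, e.g. `files["some/file.txt"]` (a placeholder name).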

Upvotes: 4
