Reputation: 113
I currently have the code below, which downloads a zip file from blob storage using a SAS URI, unzips it, and uploads the contents to a new container:
var response = await new BlobClient(new Uri(sasUri)).DownloadAsync();
using (ZipArchive archive = new ZipArchive(response.Value.Content))
{
    foreach (ZipArchiveEntry entry in archive.Entries)
    {
        BlobClient blobClient = _blobServiceClient.GetBlobContainerClient(containerName).GetBlobClient(entry.FullName);
        using (var fileStream = entry.Open())
        {
            await blobClient.UploadAsync(fileStream, true);
        }
    }
}
This code fails for me with a "Stream was too long" exception:
System.IO.IOException: Stream was too long.
   at System.IO.MemoryStream.Write(Byte[] buffer, Int32 offset, Int32 count)
   at System.IO.Stream.CopyTo(Stream destination, Int32 bufferSize)
   at System.IO.Compression.ZipArchive.Init(Stream stream, ZipArchiveMode mode, Boolean leaveOpen)
My zip file is 9 GB. What would be a better way to get around this exception? I'd like to avoid writing any files to disk.
Upvotes: 2
Views: 3273
Reputation: 113
The solution below worked for me. Instead of using DownloadAsync, use OpenReadAsync, which returns a seekable stream over the blob, so ZipArchive reads entries directly from blob storage instead of buffering the entire file into memory:
var response = await new BlobClient(new Uri(sasUri)).OpenReadAsync(new BlobOpenReadOptions(false), cancellationToken);
using (ZipArchive archive = new ZipArchive(response))
{
    foreach (ZipArchiveEntry entry in archive.Entries)
    {
        BlobClient blobClient = _blobServiceClient.GetBlobContainerClient(containerName).GetBlobClient($"{buildVersion}/{entry.FullName}");
        using (var fileStream = entry.Open())
        {
            await blobClient.UploadAsync(fileStream, true, cancellationToken).ConfigureAwait(false);
        }
    }
}
Upvotes: 2
Reputation: 81473
So the issue here is that ZipArchive needs a seekable stream, and the download stream isn't one, so ZipArchive.Init copies it into a MemoryStream. A MemoryStream is backed by a single byte array, which cannot grow past roughly 2 GB, hence the "Stream was too long" exception.
So, you will need to allow larger objects (somehow): the <gcAllowVeryLargeObjects> configuration element (or the COMPlus_gcAllowVeryLargeObjects environment variable on .NET Core) lifts the 2 GB object-size limit.
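For reference, on .NET Framework that switch is a runtime element in App.config (on .NET Core you would set the COMPlus_gcAllowVeryLargeObjects environment variable to 1 instead):

```xml
<configuration>
  <runtime>
    <!-- Allows objects (e.g. arrays) larger than 2 GB on 64-bit platforms -->
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>
```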
However, putting 9 gigs of anything on the large object heap is problematic, it's inefficient for the GC among other issues, and you should really avoid the LOH as much as you can.
Note that, depending on the library and what you have access to, there might be less LOH-heavy ways to do this. If you can supply your own streams / data structures, there are libraries which can break buffers up so they don't get allocated aggressively on the LOH, via things like ReadOnlySequence and Microsoft's little-known RecyclableMemoryStream.
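A minimal sketch of the RecyclableMemoryStream approach, assuming the Microsoft.IO.RecyclableMemoryStream NuGet package is referenced; it backs the stream with pooled, chained blocks rather than one contiguous byte array, which avoids large single LOH allocations (whether a 9 GB stream is actually permitted still depends on the library version and its configured limits):

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.IO;

class PooledBuffering
{
    // Reuse a single manager for the process lifetime; it owns the
    // buffer pools that the streams it hands out borrow from and return to.
    private static readonly RecyclableMemoryStreamManager Manager =
        new RecyclableMemoryStreamManager();

    // Copies a (possibly non-seekable) source into a pooled, seekable
    // stream that can then be handed to ZipArchive. The "zip-download"
    // tag is just a label for diagnostics.
    static async Task<MemoryStream> BufferAsync(Stream source)
    {
        MemoryStream buffered = Manager.GetStream("zip-download");
        await source.CopyToAsync(buffered);
        buffered.Position = 0;
        return buffered;
    }
}
```

The caller is responsible for disposing the returned stream, which returns its blocks to the pool instead of leaving them for the GC.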
Upvotes: 1