Andrei Barbolin

Reputation: 435

Azure Data Lake Store - error while reading from file

I use the FileSystemOperationsExtensions.Open method, which returns a Stream that I can read from. Sometimes, when the service reads large files (~150-300 MB) from the stream, it gets the following exceptions:

System.IO.IOException: The read operation failed, see inner exception. ---> System.Net.WebException: The request was aborted: The request was canceled.
at System.Net.ConnectStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at System.Net.Http.HttpClientHandler.WebExceptionWrapperStream.Read(Byte[] buffer, Int32 offset, Int32 count)

"ClassName": "System.IO.IOException",
"Message": "Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host."
at System.Net.ConnectStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at System.Net.Http.HttpClientHandler.WebExceptionWrapperStream.Read(Byte[] buffer, Int32 offset, Int32 count)

These errors occur randomly. I create the DataLakeStoreFileSystemManagementClient object with a 60-minute timeout, but the errors occur before that timeout elapses: after 3, 10, 20, or any other number of minutes. Of course, I could reopen the stream and resume reading from an offset, but that requires extra development time. Is there another way to avoid these exceptions? Could anybody help me with this?

Upvotes: 1

Views: 1506

Answers (1)

Tom Sun

Reputation: 24569

I ran a demo test with a 270 MB+ file three times, and it worked correctly every time. Please try the following code to test it. More Data Lake Store demo code is available in the Data Lake Store "get started with the .NET SDK" documentation.


Demo code:

using System;
using System.Diagnostics;
using System.IO;
using Microsoft.Azure.Management.DataLake.Store;
using Microsoft.Rest.Azure.Authentication;

// The statements below belong inside a method such as Main.
var applicationId = "Application Id";
var secretKey = "secretkey";
var tenantId = "tenant id";
var adlsAccountName = "Account name";

// Authenticate as a service principal and create the client with a 60-minute timeout.
var creds = ApplicationTokenProvider.LoginSilentAsync(tenantId, applicationId, secretKey).Result;
var adlsFileSystemClient = new DataLakeStoreFileSystemManagementClient(creds, clientTimeoutInMinutes: 60);
var srcPath = "/mytempdir/ForDemoCode.zip";
var destPath = @"c:\tom\ForDemoCode1.zip";

Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();

// Open the remote file and copy it to a local file.
using (var stream = adlsFileSystemClient.FileSystem.Open(adlsAccountName, srcPath))
using (var fileStream = new FileStream(destPath, FileMode.Create))
{
    stream.CopyTo(fileStream);
}

var file = new FileInfo(destPath);
Console.WriteLine($"File size :{file.Length}");
stopWatch.Stop();

// Get the elapsed time as a TimeSpan value.
TimeSpan ts = stopWatch.Elapsed;
// Format and display the TimeSpan value.
string elapsedTime = $"{ts.Hours:00}:{ts.Minutes:00}:{ts.Seconds:00}.{ts.Milliseconds / 10:00}";
Console.WriteLine("RunTime " + elapsedTime);
Console.ReadKey();

packages.config file:

<?xml version="1.0" encoding="utf-8"?>
<packages>
  <package id="Microsoft.Azure.Management.DataLake.Store" version="2.1.1-preview" targetFramework="net452" />
  <package id="Microsoft.Azure.Management.DataLake.StoreUploader" version="1.0.0-preview" targetFramework="net452" />
  <package id="Microsoft.IdentityModel.Clients.ActiveDirectory" version="3.13.8" targetFramework="net452" />
  <package id="Microsoft.Rest.ClientRuntime" version="2.3.5" targetFramework="net452" />
  <package id="Microsoft.Rest.ClientRuntime.Azure" version="3.3.5" targetFramework="net452" />
  <package id="Microsoft.Rest.ClientRuntime.Azure.Authentication" version="2.2.0-preview" targetFramework="net452" />
  <package id="Newtonsoft.Json" version="9.0.2-beta1" targetFramework="net452" />
</packages>
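
If the intermittent failures persist in your environment, the reread-with-offset approach mentioned in the question does not need much code. The following is only a minimal sketch, not something from the SDK: the class ResumableDownload, the method CopyWithResume, and the openAt delegate are hypothetical names introduced for illustration. It assumes you can supply a function that opens the remote file at a given byte offset, and it reopens the stream at the last successfully copied position whenever a read throws an IOException.

```csharp
using System;
using System.IO;

public static class ResumableDownload
{
    // openAt is a hypothetical delegate (an assumption of this sketch): given a
    // byte offset, it must return a stream that starts at that offset in the file.
    public static long CopyWithResume(Func<long, Stream> openAt, Stream destination,
                                      int maxRetries = 5, int bufferSize = 81920)
    {
        long offset = 0;   // number of bytes already written to destination
        int failures = 0;  // consecutive failures without any progress
        var buffer = new byte[bufferSize];

        while (true)
        {
            try
            {
                using (var source = openAt(offset))
                {
                    int read;
                    while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        destination.Write(buffer, 0, read);
                        offset += read;
                        failures = 0; // progress was made, so reset the counter
                    }
                }
                return offset; // end of stream reached
            }
            catch (IOException)
            {
                // Give up after too many consecutive failures; otherwise loop
                // and reopen the source at the current offset.
                if (++failures > maxRetries) throw;
            }
        }
    }
}
```

With the Data Lake Store client, openAt would wrap the Open call and pass the offset through. Whether Open exposes an offset parameter, and what it is called, depends on the installed SDK version, so check the signature before relying on it. A short delay between retries may also be worth adding.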

Upvotes: 1
