Nishant

Reputation: 905

Streams and their purpose in .NET

I am still not clear on the concept of streams in .NET.

FileStream for example:

using (FileStream fs = File.Open(@"C:\temp\Test.txt", FileMode.Open, FileAccess.Write, FileShare.None))

The above code gets me a FileStream object. If my understanding is correct, the FileStream object I get is a byte representation of the file C:\temp\Test.txt.

My question: Is there a physical reference to the file C:\temp\Test.txt?

Is FileStream just an abstraction of the byte representation of the underlying file? If yes, can I pass this FileStream object to, say, a web service residing on some other Windows machine?

Also, when is it appropriate to use a stream? Consider a scenario where I need to read a file from some remote directory and SFTP it to some location. Does it make sense to create a FileStream here?

Upvotes: 2

Views: 1295

Answers (2)

Matthew Haugen

Reputation: 13286

The Stream type is fundamentally meant as a wrapper for I/O operations. That's its purpose. There's sometimes some fancy caching that goes on, and there are definitely such things as MemoryStreams which don't talk to any external objects, but essentially, the theory is that a stream is how you talk to those objects.

MSDN has a list of .NET Framework types that inherit from Stream, which is a bit too long to bother including here, but what you'll notice there is that for most of them, the goal is either to read or write from or to outside sources, or to process other streams in real-time as they do those operations.

It's important to remember that no, a stream is not just a byte array. It happens that byte arrays are just a really good way of reading data out of a stream. Network streams are a good example of this. Without caching turned on through any means, you don't have a way of moving back or forward in the stream artificially--you read the data, and that's it.

File streams let you jump around because the disk is sitting under you to do that kind of thing, but since NICs don't do caching on their own, networks can't.

As such, no, you can't pass a stream directly to a webservice. Essentially, in most cases, a stream is just a wrapper for a pointer (practically a driver) to some I/O operation. If the system even supported it, which it doesn't through any easy means, sending just the stream would be like emailing someone a link to a file on your C:\ drive.

What you can do, however, is copy data from one stream to another. For instance, you could copy data from a FileStream to a NetworkStream, thereby allowing you to transfer a file to a web service. The data will be buffered by the system on its way through, and basically read from one stream and directly written to another.
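As a minimal sketch of that copy, assuming a hypothetical helper named SendFile (not part of any library), the method below opens a file and pushes its bytes into whatever destination stream it is handed. In a real transfer the destination would be something like a NetworkStream; any writable Stream works, which is why a MemoryStream can stand in for it when trying this out locally.

```csharp
using System.IO;

public static class StreamCopySketch
{
    // Hypothetical helper: copies the file at 'path' into 'destination'.
    // The caller owns 'destination' and is responsible for disposing it.
    public static void SendFile(string path, Stream destination)
    {
        using (FileStream source = File.OpenRead(path))
        {
            // CopyTo reads a buffer's worth of bytes at a time from the source
            // and writes them to the destination until the source is exhausted.
            source.CopyTo(destination);
        }
    }
}
```

Note that only the bytes travel to the destination. The FileStream itself never leaves the machine that opened the file.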

To better understand this concept of real-time data, look at an example. Imagine you're reading from the disk. This glosses over how hard drives actually work, so it isn't strictly accurate, but in the interest of example, it's simple: you start at the beginning of a file, and read 200 bytes. The hard drive reads those 200 bytes, then stops. You then ask for another 100 bytes. The disk spins, then stops. Most noteworthy here is that the disk doesn't read all of the file, then pass it to you. If it did that, then yes, a byte array would be a nicer tool to consume it.

The real goal here is what's held in memory. With a stream, you can process a huge amount of data, effectively unbounded, without having to pull all of that data into memory in the first place. You can read it chunk-by-chunk instead.
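A rough illustration of chunk-by-chunk reading, assuming a made-up task (summing every byte) and a hypothetical helper named SumBytes: no matter how large the stream is, only one small buffer is ever held in memory.

```csharp
using System.IO;

public static class ChunkedReadSketch
{
    // Hypothetical helper: adds up every byte in 'input' while holding
    // at most 'bufferSize' bytes of its content in memory at any moment.
    public static long SumBytes(Stream input, int bufferSize = 4096)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;

        // Read may return fewer bytes than requested; it returns 0 only
        // once the end of the stream has been reached.
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                total += buffer[i];
            }
        }

        return total;
    }
}
```

The same loop would work on a multi-gigabyte FileStream, since the buffer is reused on every pass.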

I don't know how familiar you are with LINQ or IEnumerables in general, but the theories here are the same--in LINQ, until you call some ToArray() or ToList(), your enumerable isn't processed. It sits with deferred execution, waiting for you to use it. That's how streams work, too, in most cases.

Upvotes: 6

Scott Chamberlain

Reputation: 127543

Is FileStream just an abstraction of the byte representation of the underlying file

No, it is not; it is an abstraction of a reader or writer of the byte representation of the underlying file.

Stream provides an interface that allows you to read bytes in or write bytes out to a source without knowing what that source is. You could be reading a file or reading from a TCP/IP connection, and your code could process both with zero modifications if it operated on a Stream.
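To make that concrete, here is a small sketch, assuming a hypothetical CountLines helper and UTF-8 text. It accepts any readable Stream, so the same method body serves a FileStream, a NetworkStream, or a MemoryStream.

```csharp
using System.IO;
using System.Text;

public static class LineCounter
{
    // Hypothetical helper: counts lines of text in any readable stream.
    // It never asks whether the bytes come from a file, a socket, or memory.
    public static int CountLines(Stream source)
    {
        // leaveOpen: true keeps the caller in charge of disposing the stream.
        using (var reader = new StreamReader(source, Encoding.UTF8, true, 1024, leaveOpen: true))
        {
            int lines = 0;
            while (reader.ReadLine() != null)
            {
                lines++;
            }
            return lines;
        }
    }
}
```

Calling it with File.OpenRead(somePath) or with a stream obtained from a TcpClient would need no changes to CountLines itself.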

You could not pass this to a web service on another machine because the Stream is only a reader, not the file itself, so no actual information from the file would be transferred.

The time it is appropriate to use a stream is when either

  1. You need to abstract the data source or destination so you could use multiple sources without needing to write separate functions for the types of sources.
  2. You are working with large objects and you don't need to hold the entire object in memory at once in a byte[]; you only need to access parts at a time via .Read() and .Write() calls, loading or storing data in smaller, more manageable byte[]s.

In your scenario of the SFTP server you fall under the 2nd category. You don't need to wait until the entire file is loaded in memory as a byte[] before you start writing the file out to the disk; you can get small chunks of data at a time from the SFTP's NetworkStream and write them to the disk's FileStream. In fact, Stream already provides a method that does this exact process for you: Stream.CopyTo(Stream destination).

Upvotes: 3
