Reputation: 452
According to the man page, we can specify the number of bytes we want to read from a file descriptor.
But inside the implementation of read, how many read requests are created to perform a single read?
For example, if I want to read 4MB, will it create only one request for 4MB, or will it split it into multiple small requests, such as 4KB per request?
Upvotes: 4
Views: 2987
Reputation: 3500
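read(2) is a system call, so the C library wrapper traps into the kernel to dispatch it. On 32-bit x86 this goes through the vDSO entry point; on x86-64 the wrapper uses the syscall instruction directly. (In very old times it used to be a software interrupt, but nowadays there are faster ways of dispatching system calls.)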
Inside the kernel the call is first handled by the VFS (virtual file system); the VFS provides a common interface for inodes (the structures that represent files) and a common way of interfacing with the underlying file system.
The VFS dispatches to the underlying file system (the mount(8) program will tell you which mount points exist and which file system is used on each). See here for more information: http://www.inf.fu-berlin.de/lehre/SS01/OS/Lectures/Lecture16.pdf
The file system can do its own caching, so the number of disk reads depends on what is present in the cache, on how the file system allocates blocks for a particular file, and on how the file is divided into disk blocks. These are all questions for the particular file system.
If you want to do your own caching, open the file with the O_DIRECT flag; in this case the kernel makes an effort not to use the cache. However, the buffer address, the file offset and the transfer size generally all have to be aligned to the logical block size of the device, which is typically 512 bytes (this is so that your buffer can be transferred via DMA to or from the backing store: http://www.quora.com/Why-does-O_DIRECT-require-I-O-to-be-512-byte-aligned )
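A minimal sketch of an O_DIRECT read, assuming Linux with glibc and a 512-byte logical block size. The helper names round_up and read_direct are made up for illustration, and some file systems (tmpfs, for example) reject O_DIRECT, in which case open() fails and the helper returns -1.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* glibc only exposes O_DIRECT when this is defined */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: read up to len bytes from the start of path with
 * O_DIRECT. align is an assumption about the device (512 is typical).
 * On success returns the byte count and stores an aligned buffer in *out,
 * which the caller must free(). Returns -1 on any failure.
 */
static ssize_t read_direct(const char *path, size_t len, size_t align, void **out)
{
    size_t padded = round_up(len, align);   /* size must be a multiple of align */
    void *buf = NULL;

    /* The buffer address must be aligned as well. */
    if (posix_memalign(&buf, align, padded) != 0)
        return -1;

    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    ssize_t n = read(fd, buf, padded);      /* offset 0 is trivially aligned */
    close(fd);
    if (n < 0) {
        free(buf);
        return -1;
    }
    *out = buf;
    return n;
}
```

A call such as read_direct("data.bin", 4000, 512, &buf) would request 4096 bytes, because 4000 is rounded up to the next multiple of 512.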
Upvotes: 3
Reputation: 182827
There's really no one right answer, other than however many are necessary at whatever layer the request winds up going to. Typically, a single request will be passed to the kernel. This may result in no further requests going to other layers because all the information is already in memory. But if the data has to be read from, say, a software RAID, requests may have to be issued to multiple physical devices to satisfy the request.
I don't think you can really give a better answer than "whatever the implementer thought was the best way".
Upvotes: 1
Reputation: 239251
It depends on how deep you go.
The C library just passes the size you gave it straight to the kernel in one read() system call, so at that level it's just one request.
Inside the kernel, for an ordinary file in standard buffered mode the 4MB you requested is going to be copied from multiple pagecache pages (4kB each) which are unlikely to be contiguous. Any of the file data which isn't actually already in the pagecache is going to have to be read from disk. The file might not be stored contiguously on disk, so that 4MB could result in multiple requests to the underlying block device.
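To put a number on "multiple pagecache pages", here is a small arithmetic sketch. The function name pages_spanned is made up for illustration, and the 4096-byte page size is an assumption (the real value is available from sysconf(_SC_PAGESIZE)).

```c
#include <stddef.h>

/*
 * Number of page-cache pages that a read of len bytes starting at file
 * offset off touches, for a given page size. Purely arithmetic: it says
 * nothing about how many of those pages are already cached.
 */
static size_t pages_spanned(size_t off, size_t len, size_t page_size)
{
    if (len == 0)
        return 0;
    size_t first = off / page_size;              /* index of first page touched */
    size_t last  = (off + len - 1) / page_size;  /* index of last page touched  */
    return last - first + 1;
}
```

With 4096-byte pages, a 4MB read starting at offset 0 spans 1024 pages, and a 4096-byte read starting at offset 100 spans 2 pages because it straddles a page boundary.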
Upvotes: 2
Reputation: 540
When you call read, it makes just one request to fill the buffer. If it cannot fill the whole buffer (because there is no more data, or because the data has not arrived yet, as with sockets), it returns the number of bytes it actually wrote into your buffer.
As the manual says:
RETURN VALUE
Upon successful completion, these functions shall return a non-negative integer indicating the number of bytes actually read. Otherwise, the functions shall return −1 and set errno to indicate the error.
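Because the return value may be smaller than the size requested, callers that need the whole amount usually loop. A minimal sketch of such a loop follows; read_full is a hypothetical helper name, not a library function.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Keep calling read() until count bytes have arrived, end of file is
 * reached, or a real error occurs. Returns the number of bytes stored
 * in buf (less than count only at end of file), or -1 on error.
 */
static ssize_t read_full(int fd, void *buf, size_t count)
{
    size_t done = 0;

    while (done < count) {
        ssize_t n = read(fd, (char *)buf + done, count - done);
        if (n > 0) {
            done += (size_t)n;   /* short read: ask again for the rest */
        } else if (n == 0) {
            break;               /* end of file */
        } else if (errno == EINTR) {
            continue;            /* interrupted by a signal, retry */
        } else {
            return -1;           /* errno is set by read() */
        }
    }
    return (ssize_t)done;
}
```

Each iteration is still one read() system call; the loop only changes how many system calls the application issues, not how the kernel services each one.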
Upvotes: 1
Reputation: 126418
If there is data available, read will return as much data as is immediately available and will fit in the buffer, without waiting. If there's no data available, it will wait until there is some and return what it can without waiting more.
How much that is depends on what the file descriptor refers to. If it refers to a socket, that will be whatever is in the socket buffer. If it is a file, that will be whatever is in the buffer cache.
Upvotes: 1