Reputation: 85
While reading this, I found a reasonable answer, which says:
Case 1: Directly Writing to File On Disk
100 times x 1 ms = 100 ms
I understood that. Next,
Case 3: Buffering in Memory before Writing to File on Disk
(100 times x 0.5 ms) + 1 ms = 51 ms
I didn't understand the 1 ms. What is the difference between writing 100 pieces of data to disk and writing 1 piece of data to disk? Why do both of them cost 1 ms?
Upvotes: 6
Views: 6051
Reputation: 16540
When writing 1 byte at a time, each write requires:
Repeating all the above for each byte (esp. since a disk is orders of magnitude slower than memory) takes a LOT of time.
It takes no longer to write a whole sector of data than to update a single byte.
That is why writing a buffer full of data is so very much faster than writing a series of individual bytes.
There are also other overheads like updating the inodes that:
Each of those directory and file inodes is updated each time the file is updated.
Those inodes are (simply) other sectors on the disk. Overall, lots of disk activity occurs each time a file is modified.
So modifying the file only once rather than numerous times is a major time saving. Buffering is the technique used to minimize the number of disk activities.
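To make the effect of buffering concrete, here is a minimal Python sketch. `CountingRaw` is a hypothetical stand-in for the disk (it only counts how many write requests reach it), and the 4096-byte buffer size is an assumption chosen for illustration. It models only the number of write requests, not inode updates or seek times.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical stand-in for a disk file: counts the writes that reach it."""

    def __init__(self):
        super().__init__()
        self.write_calls = 0
        self.data = bytearray()

    def writable(self):
        return True

    def write(self, b):
        # Every call here represents one request sent to the "device".
        self.write_calls += 1
        self.data += b
        return len(b)


def count_device_writes(n_records, record, buffered):
    """Write `record` n_records times and return how many writes hit the device."""
    raw = CountingRaw()
    # With buffering, small writes are collected in memory and sent on flush.
    sink = io.BufferedWriter(raw, buffer_size=4096) if buffered else raw
    for _ in range(n_records):
        sink.write(record)
    sink.flush()
    return raw.write_calls


print(count_device_writes(100, b"x" * 16, buffered=False))  # 100
print(count_device_writes(100, b"x" * 16, buffered=True))   # 1
```

With 100 records of 16 bytes each (1600 bytes in total, less than the assumed buffer), the buffered version sends everything to the device in a single request.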
Upvotes: 2
Reputation: 134396
Disk access (transferring data to disk) does not happen byte by byte; it happens in blocks. So we cannot conclude that if writing 1 byte of data takes 1 ms, then writing x bytes of data will take x ms. It is not a linear relation.
The amount of data written to the disk at a time depends on the block size. For example, if a disk access costs you 1 ms and the block size is 512 bytes, then any write of 1 to 512 bytes will cost you the same 1 ms.
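The block rule can be sketched as a small cost model. The function names, the 512-byte block size, and the 1 ms per block figure are assumptions taken from the example above, not measurements of any real disk.

```python
def blocks_needed(n_bytes, block_size=512):
    """Number of whole blocks needed to hold n_bytes (integer ceiling division)."""
    return -(-n_bytes // block_size)


def write_cost_ms(n_bytes, block_size=512, ms_per_block=1.0):
    """Assumed model: every block touched costs the same fixed time."""
    return blocks_needed(n_bytes, block_size) * ms_per_block


print(write_cost_ms(1))    # 1.0
print(write_cost_ms(512))  # 1.0
print(write_cost_ms(513))  # 2.0
```

Under this model the cost grows in steps of one block, not per byte, which is why it is not linear in the number of bytes.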
So, coming back to the equation: if you have, say, 16 bytes of data to be written in each operation, for 20 iterations (320 bytes in total, which fits in one 512-byte block), then:
Directly writing to disk:
time = (20 iterations * 1 ms) = 20 ms
Buffering in memory before writing to disk:
time = (20 iterations * 0.5 ms (buffering time)) + 1 ms (to write all at once) = 10 + 1 = 11 ms
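The same arithmetic as a small Python sketch. The function names are hypothetical, and the 1 ms disk cost and 0.5 ms buffering cost are the illustrative figures used in this thread. It also assumes that all the buffered data fits in a single block, so one disk write is enough.

```python
def direct_write_ms(iterations, disk_ms=1.0):
    """Every iteration pays for one full disk access."""
    return iterations * disk_ms


def buffered_write_ms(iterations, buffer_ms=0.5, disk_ms=1.0):
    """Every iteration pays only the in-memory copy; the disk is accessed once."""
    return iterations * buffer_ms + disk_ms


print(direct_write_ms(20), buffered_write_ms(20))    # 20.0 11.0
print(direct_write_ms(100), buffered_write_ms(100))  # 100.0 51.0
```

The second line reproduces the 100 ms and 51 ms figures from the question: the single 1 ms term is the one disk access at the end.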
Upvotes: 16
Reputation: 12630
Among other things, data is written to disk in whole "blocks" only. A block is usually 512 bytes. Even if you only change a single byte inside the block, the OS and the disk will have to write all 512 bytes. If you change all 512 bytes in the block before writing, the actual write will be no slower than when changing only one byte.
The automatic caching inside the OS and/or the disk does in fact avoid this issue to a great extent. However, every "real" write operation requires a call from your program to the OS and probably all the way through to the disk driver. This takes some time. In comparison, writing into a char/byte/... array in your own process's memory in RAM costs virtually nothing.
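As a rough illustration of the two approaches, here is a sketch with two hypothetical helpers that produce the same file. The first asks the OS to write every record separately (`buffering=0` turns off Python's own buffer); the second collects the records in a byte array in process memory and calls the OS once. Actual timings depend on the OS and disk, so none are claimed here.

```python
import os
import tempfile


def write_each_record(path, records):
    """One call into the OS per record (Python-level buffering disabled)."""
    with open(path, "wb", buffering=0) as f:
        for record in records:
            f.write(record)


def write_via_memory_buffer(path, records):
    """Collect the records in process memory, then write them in one call."""
    buf = bytearray()
    for record in records:
        buf += record  # cheap: only touches RAM owned by this process
    with open(path, "wb", buffering=0) as f:
        f.write(buf)


if __name__ == "__main__":
    records = [b"0123456789abcdef"] * 100  # 100 records of 16 bytes each
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "each.bin")
        b = os.path.join(tmp, "buffered.bin")
        write_each_record(a, records)
        write_via_memory_buffer(b, records)
        print(os.path.getsize(a), os.path.getsize(b))
```

Both files end up with identical contents; the difference is only in how many times the program has to go through the OS.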
Upvotes: 0
Reputation: 2851
It is because of how disks physically work. They can take larger buffers (called pages) and save them in one go. If you save the data every time, you need multiple alterations of the same page; if you use a buffer, you edit quickly accessible memory and then save everything in one go.
His example explains the cost of each operation. Loading the data into memory takes 100 operations costing 0.5 ms each, and then there is one operation that alters the disk (an I/O operation). What is not described in the answer, and is probably not obvious, is that nearly all disks provide a bulk transfer operation. So 1 I/O operation means 1 save to the disk, not necessarily a save of 1 bit (it can be much more data).
Upvotes: 2