Reputation: 7179
When bytes are written to a file, the kernel does not immediately write those bytes to disk but stores the bytes in dirtied pages in the page cache (write-back caching).
The question is: if a file read is issued before the dirty pages are flushed to disk, will the bytes be served from the dirty pages in the cache, or will the dirty pages first be flushed to disk and then read back from disk to serve the bytes (storing them in the page cache in the process)?
Upvotes: 2
Views: 1381
Reputation: 1020
I disagree with the answers above.
Consider the following practical example: we open a file for appending, write to it without closing the file descriptor, and then read the file. The process does not see the new data.
Create a Python script (p1.py):
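#!/usr/bin/python2
# Python 2 script: append a timestamp every second, then re-read the file.
import time
from datetime import datetime

f = open("demo.txt", "a")  # write handle stays open for the whole run
while True:
    time.sleep(1)
    now = datetime.now()
    print ("===========")
    print ("write to file => date=%s") % (now)
    f.write(repr(now))  # written through the still-open handle, no flush or close
    print ("===========")
    print ("read from file :")
    f2 = open("demo.txt", "r")  # separate read-only handle on the same file
    print(f2.read())
    f2.close()
f.close()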
Create a new file and fill it with some content:
# touch demo.txt
# echo "Initial content!" > ./demo.txt
Launch the Python script:
# ./p1.py
===========
write to file => date=2023-07-06 14:23:14.611576
===========
read from file :
Initial content!
===========
write to file => date=2023-07-06 14:23:15.613440
===========
read from file :
Initial content!
===========
write to file => date=2023-07-06 14:23:16.616518
===========
read from file :
Initial content!
===========
write to file => date=2023-07-06 14:23:17.618961
===========
read from file :
Initial content!
So when we write to a file and then try to read from it, we cannot see the new data that resides in the page cache. Of course, once we close the write descriptor, we finally see the new data.
Also, if you try to read from another process, you cannot see any new data either.
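One caveat worth checking before drawing conclusions about the page cache: Python's file object keeps small writes in its own userspace buffer until flush() or close() is called, so in a script like the one above the bytes may not have been handed to the kernel at all when the read happens. The following is a minimal sketch in Python 3 (not the original Python 2 script), assuming a temporary file and a hypothetical helper name, that compares what a second handle sees before and after flush(). Note that flush() only issues the write() system call; it does not force the data to disk.

```python
import os
import tempfile


def visible_before_and_after_flush(payload):
    """Return (contents seen before flush, contents seen after flush).

    Hypothetical helper for illustration; payload must be small enough
    to fit in Python's default write buffer.
    """
    fd, path = tempfile.mkstemp(text=True)
    os.close(fd)
    try:
        with open(path, "w") as init:
            init.write("Initial content!\n")

        writer = open(path, "a")
        try:
            writer.write(payload)      # held in Python's userspace buffer
            with open(path) as reader:
                before = reader.read()

            writer.flush()             # passes the bytes to the kernel; no fsync
            with open(path) as reader:
                after = reader.read()
        finally:
            writer.close()
        return before, after
    finally:
        os.unlink(path)


if __name__ == "__main__":
    before, after = visible_before_and_after_flush("new data")
    print("before flush:", repr(before))
    print("after flush: ", repr(after))
```

If the second read shows the appended data right after flush(), the earlier invisibility came from the language runtime's buffering rather than from the kernel's write-back cache.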
Upvotes: 0
Reputation: 4432
The file read will fetch data from page cache without writing to disk. From Linux Kernel Development 3rd Edition by Robert Love:
Whenever the kernel begins a read operation—for example, when a process issues the read() system call—it first checks if the requisite data is in the page cache. If it is, the kernel can forgo accessing the disk and read the data directly out of RAM. This is called a cache hit. If the data is not in the cache, called a cache miss, the kernel must schedule block I/O operations to read the data off the disk.
Writeback to disk happens periodically, separate from read:
The third strategy, employed by Linux, is called write-back. In a write-back cache, processes perform write operations directly into the page cache. The backing store is not immediately or directly updated. Instead, the written-to pages in the page cache are marked as dirty and are added to a dirty list. Periodically, pages in the dirty list are written back to disk in a process called writeback, bringing the on-disk copy in line with the in-memory cache.
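The behaviour described in these two passages can be observed from user space. Here is a rough sketch, assuming a temporary file and a hypothetical helper name: it writes with os.write (which calls write() directly, with no userspace buffering), never calls fsync, and reads the bytes back through a second descriptor. It cannot prove that the kernel has not already written the pages back, only that the read sees the data without any explicit sync.

```python
import os
import tempfile


def read_after_unsynced_write(payload):
    """Write payload without fsync, then read it back via another descriptor.

    Hypothetical helper for illustration; intended for small payloads
    on a regular file.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)                  # straight to the kernel
        rfd = os.open(path, os.O_RDONLY)       # independent read descriptor
        try:
            return os.read(rfd, len(payload))
        finally:
            os.close(rfd)
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(read_after_unsynced_write(b"hello page cache"))
```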
Upvotes: 5
Reputation: 66153
From an application developer's point of view, it is reasonable to assume that a read issued after a write will return the data stored by that write.
Linux provides such a guarantee and hides the implementation details. So, whether or not caching is used, the effect of the write is the same: a further read will return what was passed to write.
Upvotes: -1