Reputation: 401
Context:
OS: Red Hat 8.x
File systems: ext4, XFS
Storage types: SSD, HDD
Corruption: here this means any event that results in written data not being retrievable as it was written, e.g. disk-device-level corruption.
The Linux read call signature is ssize_t read(int fd, void buf[.count], size_t count);
Say the file referred to by fd has both corrupted and uncorrupted segments. Assume the segments are A(OK)--B(corrupted)--C(OK)--D(corrupted)--E(OK), fd's file position is at or before the beginning of A, and "count" is large enough to cover all of segments A through E. If the read request passes through one or more corrupted segments, is it possible for read's return value to be greater than zero (and for buf to contain data)?
If so:
1.1. What would buf contain? Would it contain any data from the corrupted segments B and D? What could the return value of read be?
1.2. What is the probability of this happening? What factors could increase that probability, e.g. a reboot?
Would the file size returned by fstat count any bytes from corrupted segments?
Purpose: I am trying to decide, under the OS and file system conditions given above, whether I need to store an application-level checksum alongside the written (binary) data and, when reading the same file back, validate that checksum whenever read reports success (i.e. return value > 0) before treating the data as valid.
I am also not worried about an intruder modifying the written data here. I am only concerned about things that can result from system activity, e.g. a machine reboot.
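To make the scheme under "Purpose" concrete, here is a rough sketch of what I have in mind. The record layout (payload followed by a 4-byte little-endian CRC-32) and the names crc32_calc, seal_record and record_is_valid are my own assumptions for illustration, not an existing API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 using the reflected polynomial 0xEDB88320. */
static uint32_t crc32_calc(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/*
 * Hypothetical record layout: payload bytes followed by the CRC-32 of the
 * payload, stored little-endian. `out` must have room for len + 4 bytes.
 * Returns the total record length to hand to write().
 */
static size_t seal_record(const void *payload, size_t len, unsigned char *out)
{
    uint32_t crc = crc32_calc(payload, len);
    memcpy(out, payload, len);
    for (int i = 0; i < 4; i++)
        out[len + i] = (unsigned char)(crc >> (8 * i));
    return len + 4;
}

/*
 * Called on bytes that read() returned successfully.
 * Returns 1 if the trailing CRC matches the payload, 0 otherwise.
 */
static int record_is_valid(const unsigned char *rec, size_t rec_len)
{
    if (rec_len < 4)
        return 0;
    size_t len = rec_len - 4;
    uint32_t stored = 0;
    for (int i = 0; i < 4; i++)
        stored |= (uint32_t)rec[len + i] << (8 * i);
    return crc32_calc(rec, len) == stored;
}
```

The writer would pass the sealed record to write, and the reader would call record_is_valid on whatever read returned before trusting the payload.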
Upvotes: 0
Views: 121
Reputation: 76489
If A can be read, the kernel will return the length of A, and that portion of the read will succeed. This is known as a short read. After that, if you call read again and B cannot be read, you will get an EIO error. The cause could be a problem with a network file system, a bad block, a file system error, or anything else that prevents the data from being read.
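A minimal sketch of how a caller might observe this sequence, assuming a hypothetical helper named read_until_error (not a standard function). It keeps calling read after short reads and reports both how many bytes arrived and which errno, if any, stopped it.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to `count` bytes from fd into buf, calling
 * read() again after each short read. Returns the number of bytes actually
 * read. *err is 0 if the loop ended because the buffer was filled or EOF
 * was reached, otherwise it holds the errno (for example EIO) that stopped
 * the loop.
 */
static size_t read_until_error(int fd, void *buf, size_t count, int *err)
{
    size_t done = 0;
    *err = 0;
    while (done < count) {
        ssize_t n = read(fd, (char *)buf + done, count - done);
        if (n > 0) {
            done += (size_t)n;      /* possibly a short read; keep going */
        } else if (n == 0) {
            break;                  /* end of file */
        } else if (errno == EINTR) {
            continue;               /* interrupted by a signal; retry */
        } else {
            *err = errno;           /* e.g. EIO on an unreadable region */
            break;
        }
    }
    return done;
}
```

With the layout from the question, one would expect this to return the length of A with *err set to EIO, though exactly where the short read ends depends on the file system and device.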
Once the call to read B fails, it will continue to fail because the file offset is not advanced past B. If you use pread to read an unaffected portion, or lseek to an unaffected portion, you will be able to continue reading until you hit the next affected portion.
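As a sketch of the pread approach, the following walks a file in fixed-size chunks and simply moves on when a chunk fails. The names scan_file and scan_result are hypothetical, and the CHUNK size of 4096 is an assumed granularity; the unit at which a real device or file system reports errors may differ. The caller supplies file_size, for example from fstat.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK 4096 /* assumed step size, not a guaranteed error granularity */

/* Hypothetical summary of a scan. */
struct scan_result {
    off_t readable_bytes; /* bytes that pread() returned successfully */
    long  failed_chunks;  /* chunks where pread() returned an error */
};

/*
 * Walk [0, file_size) in CHUNK-sized steps. pread() takes an explicit
 * offset, so a failure at one offset does not prevent trying the next one.
 */
static struct scan_result scan_file(int fd, off_t file_size)
{
    struct scan_result r = {0, 0};
    char buf[CHUNK];

    for (off_t off = 0; off < file_size; off += CHUNK) {
        size_t want = (file_size - off < CHUNK)
                          ? (size_t)(file_size - off)
                          : (size_t)CHUNK;
        ssize_t n;
        do {
            n = pread(fd, buf, want, off);
        } while (n < 0 && errno == EINTR);

        if (n < 0)
            r.failed_chunks++;      /* e.g. EIO: skip this chunk */
        else
            r.readable_bytes += n;  /* counts only the bytes returned */
    }
    return r;
}
```

A real tool would probably also record the offsets of the failed chunks; this version only counts them to keep the sketch short.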
This is generally the standard Unix behaviour, and would be expected of any POSIX system. The error code on failure might differ in some cases on some systems (for example, the OS might automatically remount the file system read only and return some other error code in that case), but generally one reads all the data that can be validly read, and then if further progress is not possible, one gets an error.
Upvotes: 1