Reputation: 113
I have a program that reads from a file using O_DIRECT. The file is being continuously written to by another process. The read loop works fine until it reaches the point where the write is happening. At that point, read() fails with EINVAL (Invalid argument). If I don't use O_DIRECT, this doesn't happen and read() keeps returning valid data. I have verified that the block size is 4096.
Here is my code:
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <malloc.h>
#include <iostream>

#define BUFFER_SIZE 524288 // 512 KB

int main() {
    int fd;
    char *buffer;
    ssize_t bytesRead;

    // Open file with O_DIRECT
    fd = open("/data/file.lz4", O_RDONLY | O_DIRECT);
    if (fd == -1) {
        perror("Error opening file");
        return EXIT_FAILURE;
    }

    // Allocate aligned memory for O_DIRECT
    buffer = (char *) aligned_alloc(4096, BUFFER_SIZE);
    if (buffer == NULL) {
        perror("Error allocating buffer");
        close(fd);
        return EXIT_FAILURE;
    }

    // Read in a loop (keeps reading after a short read or a 0 return)
    while (1) {
        bytesRead = read(fd, buffer, BUFFER_SIZE);
        if (bytesRead == -1) {
            std::cerr << "Error reading file: " << strerror(errno) << std::endl;
            std::cerr << "fd: " << fd << " buffer: " << (void*)buffer << " BUFFER_SIZE: " << BUFFER_SIZE << std::endl;
            break;
        }
        std::cout << bytesRead << std::endl;
    }

    // Free the aligned buffer
    free(buffer);
    // Close file
    close(fd);
    return 0;
}
Observed Behavior
The program reads correctly while the file has data. When it reaches the writing point, the output is:
524288
524288
524288
524288
524288
172020
0
0
Error reading file: Invalid argument
fd: 3 buffer: 0x7f4140e88000 BUFFER_SIZE: 524288
This means read() is failing when trying to read data that is being written at the same time.
What I've Checked:
The buffer is correctly aligned using aligned_alloc(4096, BUFFER_SIZE).
BUFFER_SIZE is a multiple of 4096, ensuring it meets O_DIRECT alignment requirements.
fd is valid before the error occurs.
The file exists and is being continuously updated by another process.
Details
filesystem: ext4 (local SSD)
cpu model: Intel(R) Core(TM) i9-14900KS
Linux: 5.15.77-1-lts
Questions
Why does read() return EINVAL when it reaches the writing point?
Is there a way to wait for more data instead of failing when read() reaches unwritten portions?
Any insights into how O_DIRECT interacts with concurrent writes would be greatly appreciated!
Upvotes: 4
Views: 60
Reputation: 123
As suggested in Alan's comment:
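With O_DIRECT, every read has to start at a file offset that is a multiple of the block size. The short read (172020 bytes, which is not a multiple of 4096) leaves the file offset unaligned, so a later read() from that offset fails with EINVAL. After any read that does not return a multiple of the block size, you need to seek back to the start of the block before reading again.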
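A minimal sketch of that idea, assuming the 4096-byte block size from the question. The names align_down, NewData, read_aligned and open_direct are made up for illustration and are not part of any library. Instead of relying on the file descriptor's own offset, the sketch tracks how many bytes have been consumed, always calls pread() from the start of the block containing that position, and skips the bytes it has already seen.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Round an offset down to the start of its block (block must be a power of two).
inline off_t align_down(off_t offset, off_t block) {
    assert(block > 0 && (block & (block - 1)) == 0);
    return offset & ~(block - 1);
}

// Hypothetical helper: open a file for direct reads, as in the question.
inline int open_direct(const char *path) {
    return open(path, O_RDONLY | O_DIRECT);
}

// Bytes returned by one call that the caller has not seen before.
struct NewData {
    const char *data;  // points into the caller's buffer
    size_t len;        // 0 means nothing new yet
};

// Read starting at the block that contains `consumed`, then report only the
// bytes at or past `consumed`. On success, `consumed` advances by out.len.
// Returns false if pread() fails (errno is set by pread()).
// For O_DIRECT, buf and buf_size must themselves be block-aligned.
inline bool read_aligned(int fd, char *buf, size_t buf_size, off_t block,
                         off_t &consumed, NewData &out) {
    const off_t start = align_down(consumed, block);
    const ssize_t n = pread(fd, buf, buf_size, start);
    if (n < 0) {
        return false;
    }
    // Bytes at the front of the buffer that were delivered by an earlier call.
    const size_t skip = static_cast<size_t>(consumed - start);
    const size_t got = static_cast<size_t>(n);
    out.data = buf + skip;
    out.len = got > skip ? got - skip : 0;
    consumed += static_cast<off_t>(out.len);
    return true;
}
```

In the reader loop you would call read_aligned() repeatedly with the aligned_alloc'd buffer. When out.len comes back as 0 there is nothing new yet; read() on a regular file does not block at end of file, so one option is to sleep briefly (for example with usleep()) and try again. The cost of this approach is that the partially filled last block is read again on every retry.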
Upvotes: 1