Steve Lau

Reputation: 1003

Direct I/O alignment restrictions on Btrfs

When performing direct I/O, the buffer's memory address, the file or device offset, and the length of the data to be transferred must all be multiples of the disk block size (typically 512 bytes); otherwise the call fails with errno set to EINVAL.

The Linux Programming Interface provides a code snippet to verify this:

/*
   Usage: direct_read file length [offset [alignment]]
*/
#define _GNU_SOURCE     /* Obtain O_DIRECT definition from <fcntl.h> */
#include <fcntl.h>
#include <malloc.h>
#include "tlpi_hdr.h"

int
main(int argc, char *argv[])
{
    int fd;
    ssize_t numRead;
    size_t length, alignment;
    off_t offset;
    char *buf;
    if (argc < 3 || strcmp(argv[1], "--help") == 0)
        usageErr("%s file length [offset [alignment]]\n", argv[0]);
    length = getLong(argv[2], GN_ANY_BASE, "length");
    offset = (argc > 3) ? getLong(argv[3], GN_ANY_BASE, "offset") : 0;
    alignment = (argc > 4) ? getLong(argv[4], GN_ANY_BASE, "alignment") : 4096;


    fd = open(argv[1], O_RDONLY | O_DIRECT);
    if (fd == -1)
        errExit("open");

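    /* memalign() returns a block aligned on a multiple of its first
       argument. Asking for 2 * alignment and then adding alignment yields
       a buffer aligned on 'alignment' but not on 2 * alignment. */
    buf = memalign(alignment * 2, length + alignment);
    if (buf == NULL)
        errExit("memalign");

    buf += alignment;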

    if (lseek(fd, offset, SEEK_SET) == -1)
        errExit("lseek");

    numRead = read(fd, buf, length);
    if (numRead == -1)
        errExit("read");
    printf("Read %ld bytes\n", (long) numRead);

    exit(EXIT_SUCCESS);
}
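The allocation step in that program is easy to misread, so here is a minimal sketch of the same idea in isolation. The helper names `alloc_odd_aligned` and `is_multiple_of` are made up for illustration, and standard `aligned_alloc` is swapped in for `memalign`; it is not the book's code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from the book): returns a pointer that is a
 * multiple of `alignment` but NOT of 2 * alignment. `alignment` is assumed
 * to be a power of two. *base_out receives the pointer to pass to free(). */
void *alloc_odd_aligned(size_t alignment, size_t length, void **base_out)
{
    size_t big = alignment * 2;
    /* Round the size up to a multiple of the requested alignment, which
     * aligned_alloc() may require. */
    size_t size = ((length + alignment + big - 1) / big) * big;
    void *base = aligned_alloc(big, size);
    if (base == NULL)
        return NULL;
    *base_out = base;
    /* Same step as `buf += alignment` in the program above. */
    return (char *) base + alignment;
}

/* Returns 1 if the address p is a multiple of n, else 0. */
int is_multiple_of(const void *p, size_t n)
{
    return ((uintptr_t) p % n) == 0;
}
```

With alignment 4096, the returned pointer should pass `is_multiple_of(p, 4096)` and fail `is_multiple_of(p, 8192)`, so the buffer is never "accidentally" aligned on a larger boundary than the one being tested.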

getLong is a number-parsing function (it calls strtol(3) under the hood), and errExit prints a human-readable error message to stderr according to the current value of errno and then terminates the process (roughly perror(); exit(1)). If you are curious about their implementations:

I then deliberately violated the alignment restriction, expecting to see EINVAL. However, the read succeeded:

# read 256 bytes from source with offset and alignment set to 0 and 4096 respectively
# 256 is not a multiple of 512 bytes
# source is a regular file whose size is bigger than 256 bytes
$ ./a.out source 256
Read 256 bytes

# Environment
$ uname -a
Linux fedora 5.19.8-200.fc36.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Sep 8 19:02:21 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ stat -f .
  File: "."
    ID: e00375ccffa20181 Namelen: 255     Type: btrfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 121680640  Free: 109268275  Available: 109118327
Inodes: Total: 0          Free: 0

The result I expected is something like:

$ ./a.out source 256
ERROR [EINVAL Invalid argument] read
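To spell out why EINVAL is the expected outcome, the three conditions can be written as a single check. This is only a sketch under the assumption of a 512-byte block size; `dio_args_aligned` is a hypothetical helper, not a kernel or libc function, and it says nothing about what a given file system actually enforces.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 only if the buffer address, the file
 * offset, and the transfer length are all multiples of block_size. */
int dio_args_aligned(const void *buf, off_t offset, size_t length,
                     size_t block_size)
{
    if (block_size == 0 || offset < 0)
        return 0;
    return ((uintptr_t) buf % block_size == 0)
        && ((uintmax_t) offset % block_size == 0)
        && (length % block_size == 0);
}
```

For the run above (buffer aligned on 4096, offset 0, length 256), the buffer and offset pass but 256 is not a multiple of 512, so the check returns 0.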

Suspecting that this was specific to Btrfs, I ran the same test on another machine whose file system is Ext4:

$ uname -a
Linux pop-os 5.19.0-76051900-generic #202207312230~1660780566~22.04~9d60db1 SMP PREEMPT_DYNAMIC Thu A x86_64 x86_64 x86_64 GNU/Linux
$ stat -f .
File: "."
    ID: 4885eb446c106708 Namelen: 255     Type: ext2/ext3
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 26819732   Free: 9989622    Available: 8615696
Inodes: Total: 6856704    Free: 5787763


$ ./a.out source 256
ERROR [EINVAL Invalid argument] read

As you can see, on Ext4 I got the EINVAL error.

So my question is: why is the alignment restriction not enforced on Btrfs?

Upvotes: 0

Views: 287

Answers (0)
