Nulik

Reputation: 7360

minimum write size with O_DIRECT

I am writing a custom database engine in C for Linux 2.6.x kernels, and I need to know the minimum write size of a write() system call on a file opened with the O_DIRECT flag. The docs say that since Linux 2.6 you can use 512-byte blocks. But what if my hard disk uses 8K blocks? Will write() return EINVAL in that case? I heard that disks with 512-byte sectors are becoming obsolete and that new disks use 8K sectors, so I need to be sure my app doesn't crash when a user runs it on such a disk. If it is possible to issue 512-byte writes to a disk with 8K sectors, what happens when I write, say, 2 blocks of 512 bytes? Does the Linux kernel read the 8K sector from disk, replace the 1K I told it to write, and then write the 8K sector back to disk? That would be really slow!

I also have another question on this issue: does the minimum write size vary depending on whether I open a raw device or a file on an ext3 filesystem?

Upvotes: 5

Views: 2782

Answers (3)

Iozone

Reputation: 1

It's way more complicated than folks might think.

  1. The buffer should be aligned on a page boundary.
  2. The file offset of the access must be aligned to the sector size.
  3. The transfer size must be a multiple of the sector size.
  4. There is NO POSIX system call for asking what the sector size of a filesystem is. Furthermore, the filesystem may span an array of disks, each of which may have a different sector size: 512, 4096, ...
  5. A read() of 512 bytes could fail if the drive is a 4K-sector drive, or it could succeed if the drive supports 512e (emulated 512-byte sectors). However, the emulation can cost performance, and throughput may be significantly lower when using 512e.
  6. Again, there is no standard POSIX system call for asking whether the drive or drives are using 512e.

Yes, it's very complicated indeed. Without asking the owner of the hardware what the optimal sector size is across all of the drives involved in the filesystem, there is no way to predict the behavior, either whether an access will succeed or fail, or how it will perform.
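As a rough illustration of rules 1 to 3 in the list above, here is a minimal C sketch. The helper names (round_up, is_aligned, direct_write) are made up for this example, and block_size is an assumed value that the caller has to supply, since, as noted, POSIX gives no way to discover it. The buffer is aligned to the page size as rule 1 suggests, and the payload is zero-padded up to a multiple of block_size.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Round n up to the next multiple of align (align must be a power of two). */
size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Non-zero if v is a multiple of align (align must be a power of two). */
int is_aligned(uint64_t v, size_t align)
{
    return (v & (uint64_t)(align - 1)) == 0;
}

/*
 * Hypothetical helper: write len bytes from src at offset off via O_DIRECT.
 * block_size is an ASSUMED sector size chosen by the caller.
 * Returns the number of bytes written (padded size), or -1 with errno set.
 */
ssize_t direct_write(const char *path, off_t off,
                     const void *src, size_t len, size_t block_size)
{
    /* Rule 2: the offset must be a multiple of the sector size. */
    if (!is_aligned((uint64_t)off, block_size)) {
        errno = EINVAL;
        return -1;
    }

    /* Rule 3: the transfer size must be a multiple of the sector size. */
    size_t padded = round_up(len, block_size);

    /* Rule 1: the buffer is aligned on a page boundary. */
    void *buf = NULL;
    int rc = posix_memalign(&buf, (size_t)sysconf(_SC_PAGESIZE), padded);
    if (rc != 0) {
        errno = rc;
        return -1;
    }
    memset(buf, 0, padded);        /* zero the padding */
    memcpy(buf, src, len);

    int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0) {
        int saved = errno;
        free(buf);
        errno = saved;
        return -1;
    }

    ssize_t n = pwrite(fd, buf, padded, off);
    int saved = errno;             /* keep pwrite's errno across cleanup */
    close(fd);
    free(buf);
    errno = saved;
    return n;
}
```

If the guessed block_size is smaller than what the device actually requires, pwrite() would be expected to fail with EINVAL rather than crash, so the caller can retry with a larger size. Some filesystems (tmpfs, for example) may reject O_DIRECT at open() time.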

Upvotes: 0

bbqueue

Reputation: 1

Slightly OT: "I heard disks with 512 byte sector are becoming obsolete and the new disks use 8K sector". Are there any 8K-sector disks out there? I believe the newer disks use 4K sectors, also known as Advanced Format disks. 8K-sector disks are being considered for the future, but I doubt any manufacturer has come out with them yet.

Regarding your query, I think it's the sector size of the disk. So if you have a 4K disk, you need to issue reads/writes with a size of 4K. With O_DIRECT the read/write is passed directly to the disk, and a disk can read/write with a granularity equal to its sector size (the logical block size reported by the disk).
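For what it's worth, on Linux the logical block size reported by a block device can be read with the BLKSSZGET ioctl. This is Linux-specific (not POSIX) and only works on a block device node, not on a regular file in a filesystem, so treat the following as a sketch under those assumptions. The function names and the fallback value are made up for illustration.

```c
#include <assert.h>
#include <linux/fs.h>      /* BLKSSZGET (Linux-specific) */
#include <sys/ioctl.h>
#include <sys/stat.h>

/*
 * Returns the logical sector size in bytes reported by the block device
 * behind fd, or -1 if fd is not a block device or the query fails.
 */
int logical_sector_size(int fd)
{
    struct stat st;
    int size = 0;

    if (fstat(fd, &st) != 0 || !S_ISBLK(st.st_mode))
        return -1;
    if (ioctl(fd, BLKSSZGET, &size) != 0 || size <= 0)
        return -1;
    return size;
}

/*
 * Hypothetical policy helper: use the detected size when available,
 * otherwise fall back to a caller-chosen guess.
 */
int choose_io_size(int detected, int fallback)
{
    assert(fallback > 0);
    return detected > 0 ? detected : fallback;
}
```

A caller might then use something like choose_io_size(logical_sector_size(fd), 4096), where 4096 is only a guess for the case where nothing can be detected. Newer kernels also provide BLKPBSZGET for the physical block size, which can differ from the logical one on 512e drives.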

Upvotes: -1

Giuseppe Guerrini

Reputation: 4426

Unfortunately there is no general way to know the constraints of O_DIRECT. This manual page seems to kill any hope:

http://www.kernel.org/doc/man-pages/online/pages/man2/open.2.html

Also, I am quite sure that the block size may change depending on the underlying filesystem.

Upvotes: 2
