ustcyue

Reputation: 651

Why is dd with the direct flag much slower than with dsync?

I was trying to use dd to test the performance of my Ceph filesystem. During testing I found something confusing: dd with oflag=dsync or conv=fdatasync/fsync is around 10 times faster than dd with oflag=direct.

My network is 2×10 Gb.

/mnt/testceph# dd if=/dev/zero of=/mnt/testceph/test1  bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 23.1742 s, 46.3 MB/s


/mnt/testceph# dd if=/dev/zero of=/mnt/testceph/test1  bs=1G count=1 conv=fdatasync
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.22468 s, 483 MB/s

Upvotes: 3

Views: 10979

Answers (1)

Anon

Reputation: 7124

dd with oflag=dsync or conv=fdatasync/fsync is around 10 times faster than dd with oflag=direct

conv=fdatasync / conv=fsync still mean that I/O is initially queued in the kernel page cache and destaged to disk as the kernel sees fit. This gives the kernel a big opportunity to merge I/Os, to create parallel submission out of I/O that has yet to be destaged, and generally to decouple I/O submission to the kernel from I/O acceptance by the disk (to the extent that buffering allows). Only when dd has finished sending ALL the data will it have to wait for anything still only in cache to be flushed to disk (and with fsync that includes any metadata).
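For intuition, conv=fdatasync behaves roughly like the following sequence (a sketch, assuming your coreutils sync supports the -d/--data option): all writes go through the page cache, and the data is only forced out once, at the end.

# Buffered writes first - nothing forces them to disk yet
dd if=/dev/zero of=/mnt/testceph/test1 bs=1G count=1
# One data flush at the very end; this is the only point a conv=fdatasync run waits on
sync --data /mnt/testceph/test1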

oflag=dsync is still allowed to make use of kernel buffering - it just forces a flush and a wait for completion after each write that dd submits. Since you are sending only one giant write, this puts you into near enough the same scenario as conv=fdatasync above.
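To see the per-write flush cost that oflag=dsync adds, shrink the block size so dd issues many writes instead of one (a sketch; the output file name is just an example):

# 1024 x 1 MiB writes, each followed by a flush and wait - expect this to be
# noticeably slower than the single 1 GiB write with oflag=dsync
dd if=/dev/zero of=/mnt/testceph/test_dsync bs=1M count=1024 oflag=dsync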

When you specify oflag=direct you are saying "trust that all my parameters are sensible and turn off as much kernel buffering as you can". In your case a bs that huge is nonsensical with O_DIRECT, as your "disk"'s maximum transfer block size (let alone the optimal size) is almost certainly smaller. You'll likely trigger splitting, but due to the memory requirements of O_DIRECT the splitting points may lead to smaller I/Os than in the cases above.
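One thing worth trying (a sketch; the block size and file name here are illustrative, not tuned for your setup) is the same total amount of data submitted with a smaller block size, so each O_DIRECT write maps more cleanly onto what the device will actually accept:

# Same 1 GiB total, but submitted as 256 x 4 MiB O_DIRECT writes
dd if=/dev/zero of=/mnt/testceph/test_direct bs=4M count=256 oflag=direct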

It's hard to tell for sure what's going on though. Really we would need to see how the I/O was leaving the bottom of the kernel (e.g. by comparing iostat output during the runs) to get a better idea.
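For example (a sketch; exact column names vary with your sysstat version), watching extended device statistics in a second terminal while each dd runs would show the average request size and queue depth actually reaching the device:

# Extended per-device stats, refreshed every second, sizes reported in MB
iostat -dxm 1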

TL;DR: maybe using oflag=direct is leading to smaller I/Os leaving the kernel, and thus worse performance in your scenario?

Upvotes: 11
