Siscia

Reputation: 1491

Improve throughput writing a lot of small files in C

I want to improve the throughput of a piece of software that writes several, usually small, files into a network-attached volume.

The volume is limited to 100 IOPS and 80 MB/s of bandwidth.

At the moment I saturate the 100 IOPS, but the bandwidth is very far from the reachable 80 MB/s: roughly 4 MB/s, often even less.

I believe the main issue is that we make a lot of small requests; those small requests saturate the IOPS while the bandwidth is left largely unexploited.

The software is written in C and I control pretty much everything down to the actual write syscall.

At the moment the architecture is multithreaded, with several threads working as "spoolers" and making synchronous write calls, each for a different file.

So suppose we have files a, b and c and threads t1, t2 and t3.

t1 will open a and call something like write(fd_a, buff_a, 1024) in a loop, and t2 (write(fd_b, buff_b, 1024)) and t3 (write(fd_c, buff_c, 1024)) will do the same.
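
For illustration, a minimal sketch of what each spooler thread does today (the path, buffer and 1 KiB chunk size are just placeholders):

    /* Minimal sketch of the current per-thread pattern: each spooler
     * streams one file in small, synchronous 1 KiB writes. */
    #include <fcntl.h>
    #include <unistd.h>

    static void spool_file(const char *path, const char *data, size_t len)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return;

        size_t off = 0;
        while (off < len) {
            size_t chunk = len - off < 1024 ? len - off : 1024;
            ssize_t n = write(fd, data + off, chunk);  /* one small request per call */
            if (n <= 0)
                break;
            off += (size_t)n;
        }
        close(fd);
    }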

Each file is a new file, so it gets created at the first write.

I believe the problem is that the requests the OS ends up making (after the Linux IO scheduler merges them) are pretty small, on the order of 10-20 blocks (5-10 kilobytes) each.

The only way I see to fix the issue is to make bigger requests, but each file is small, so I am not quite sure what the best way forward is.

A possible idea would be to make a single write request instead of a loop of several requests: look up how big the file will be, allocate enough memory, populate the buffer and finally execute a single write.
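
A minimal sketch of that idea, assuming the full contents of the file are available before writing (names are placeholders):

    /* Sketch: buffer the whole file in memory and issue one large write()
     * instead of many 1 KiB ones, so the kernel gets one big request. */
    #include <fcntl.h>
    #include <unistd.h>

    static int write_whole_file(const char *path, const char *data, size_t len)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return -1;

        /* write() may still return short counts, so loop on the remainder. */
        size_t off = 0;
        while (off < len) {
            ssize_t n = write(fd, data + off, len - off);
            if (n < 0) {
                close(fd);
                return -1;
            }
            off += (size_t)n;
        }
        return close(fd);
    }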

Another idea would be to switch to async IO, but I haven't understood what the advantages would be in this case.
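
For reference, a minimal sketch of what a POSIX AIO variant could look like (names are placeholders; link with -lrt on glibc). The potential advantage is that many such requests, across many files, can be in flight at once instead of each thread blocking on one synchronous write; whether that actually produces larger requests on the wire depends on the kernel and the network filesystem.

    /* Sketch: queue a write with POSIX AIO, then wait for it to complete.
     * In real use you would queue many of these (one per file) before waiting. */
    #include <aio.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    static int write_file_async(const char *path, char *data, size_t len)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return -1;

        struct aiocb cb;
        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = data;
        cb.aio_nbytes = len;
        cb.aio_offset = 0;

        if (aio_write(&cb) < 0) {           /* queues the request, returns immediately */
            close(fd);
            return -1;
        }

        const struct aiocb *list[1] = { &cb };
        while (aio_error(&cb) == EINPROGRESS)
            aio_suspend(list, 1, NULL);     /* block until the request completes */

        ssize_t n = aio_return(&cb);
        close(fd);
        return n == (ssize_t)len ? 0 : -1;
    }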

Do you have any other suggestions?

Upvotes: 3

Views: 494

Answers (1)

Garrigan Stafford

Reputation: 1403

You can put all the files into a tar archive in memory. Then you can write the tar archive as one large request, and extract it in a separate process, which frees up the writing program.
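
A sketch of what the in-memory tar idea could look like with libarchive (link with -larchive); the buffer, file list and sizes are placeholders and error handling is omitted:

    /* Sketch: build a tar archive in a memory buffer with libarchive, then
     * push the whole group of files to the volume as one large write(). */
    #include <archive.h>
    #include <archive_entry.h>
    #include <unistd.h>

    struct small_file {
        const char *name;
        const char *data;
        size_t      len;
    };

    static int write_files_as_tar(int out_fd, const struct small_file *files,
                                  size_t nfiles, void *buf, size_t buf_size)
    {
        size_t used = 0;

        struct archive *a = archive_write_new();
        archive_write_set_format_pax_restricted(a);      /* portable tar layout */
        archive_write_open_memory(a, buf, buf_size, &used);

        for (size_t i = 0; i < nfiles; i++) {
            struct archive_entry *e = archive_entry_new();
            archive_entry_set_pathname(e, files[i].name);
            archive_entry_set_size(e, files[i].len);
            archive_entry_set_filetype(e, AE_IFREG);
            archive_entry_set_perm(e, 0644);
            archive_write_header(a, e);
            archive_write_data(a, files[i].data, files[i].len);
            archive_entry_free(e);
        }

        archive_write_close(a);    /* finalizes the archive; 'used' holds its size */
        archive_write_free(a);

        /* One large request; a separate process can later run "tar -xf" on it. */
        return write(out_fd, buf, used) == (ssize_t)used ? 0 : -1;
    }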

Here is an idea that is a bit more "creative". First, put the files into groups based on where they are being saved (possibly by directory). Then find the largest file in the group and pad the content of every other file so that all the files are the same size. Then append the files to each other so you have one large file, and send that as a single write request. Now one large file has been written that contains a lot of equally sized smaller files, so use the Linux split command to split it back into the original files (https://kb.iu.edu/d/afar). This could work, but you have to be OK with files having padding at the end; a sketch of the pad-and-concatenate step is below.
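
Under those assumptions, the pad-and-concatenate step could look like this (the struct and names are placeholders); the receiving side would then run split -b with the padded size to cut the blob back into the individual, padded files:

    /* Sketch: pad every file in a group to the size of the largest one,
     * concatenate them, and send the group as a single large write(). */
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    struct group_file {
        const char *data;
        size_t      len;
    };

    static int write_group_padded(int out_fd, const struct group_file *files,
                                  size_t nfiles)
    {
        size_t max_len = 0;
        for (size_t i = 0; i < nfiles; i++)
            if (files[i].len > max_len)
                max_len = files[i].len;

        char *buf = calloc(nfiles, max_len);   /* zero padding comes for free */
        if (buf == NULL)
            return -1;

        for (size_t i = 0; i < nfiles; i++)
            memcpy(buf + i * max_len, files[i].data, files[i].len);

        ssize_t n = write(out_fd, buf, nfiles * max_len);   /* one big request */
        free(buf);
        return n == (ssize_t)(nfiles * max_len) ? 0 : -1;
    }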

EDIT: It is important to note that these solutions are not scalable. The long-term solution would be what @AndrewHenle suggested in the comments.

Upvotes: 1
