Radu

Reputation: 1128

FIO latency percentile changes over time

I want to measure and plot how latency percentiles change over time for an SSD. If anyone has done something similar, please share any advice you have. I am interested in both how to run FIO and how to process the results.

I will first describe the testing methodology I want to use, then describe what I have done so far (which works imperfectly), and finally ask a couple of questions.

Goal:

What I tried so far:

Questions:

Upvotes: 0

Views: 1309

Answers (1)

Anon

Reputation: 7164

@Radu - you're kind of asking this question on the wrong website (Stack Overflow is more for programming questions). Server Fault or Super User might have been more appropriate. At any rate I'll take a stab (but the answers may be low quality because you are asking LOTS of questions at the same time, so this is all I have time to answer):

There is a long time required for the FIO startup

When fio starts up, if the file you want to do I/O to doesn't exist (or exists but is smaller than the size required) then fio has to create it. The other thing fio does (if your platform supports it) is invalidate the cache of the file. If you've been queuing up a lot of cached writes that haven't been sent down to your disk, it could take time for those to be flushed and for the cache to be dropped. Since I can't see your job file I can't really say more...

Is there a method to use FIO to keep track of latency changes over time. If so, could you please provide an example?

As you've found, fio's summary output is cumulative, so it's not that useful in your case. However, you could use fio's latency logging to record latency as the job runs (fio creates an entry for EVERY I/O by default, so also see the log_avg_msec option and the Log File Formats section) and do the post-processing yourself later (you might even be able to use fiologparser_hist.py).
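If you want to do the post-processing by hand, here is a minimal sketch in Python. It assumes the per-I/O log layout described in the Log File Formats section (comma-separated: time in msec, latency value, data direction, then further columns) and that log_avg_msec was NOT set, since averaged entries would give you percentiles of averages rather than of individual I/Os. The function names, the window size and the sample lines are all made up for illustration, and the unit of the latency value depends on your fio version, so check the HOWTO for yours.

```python
import csv
import math
from collections import defaultdict


def percentile(sorted_vals, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(p / 100.0 * len(sorted_vals)))
    return sorted_vals[rank - 1]


def windowed_percentiles(lines, window_msec=1000,
                         percentiles=(50, 99, 99.9), direction=None):
    """Group latency log entries into time windows and compute percentiles.

    lines:     iterable of log lines, e.g. an open file object.
    direction: 0 = read, 1 = write, None = keep every entry.
    Returns a list of (window_start_msec, {percentile: value}) tuples.
    """
    buckets = defaultdict(list)
    for row in csv.reader(lines):
        if len(row) < 3:
            continue  # skip blank or truncated lines
        t_msec, value, ddir = int(row[0]), int(row[1]), int(row[2])
        if direction is not None and ddir != direction:
            continue
        buckets[t_msec // window_msec].append(value)

    result = []
    for idx in sorted(buckets):
        vals = sorted(buckets[idx])
        result.append((idx * window_msec,
                       {p: percentile(vals, p) for p in percentiles}))
    return result


# Hypothetical sample entries: time, latency, direction, block size, offset
sample = [
    "10, 100, 1, 4096, 0",
    "20, 200, 1, 4096, 4096",
    "900, 300, 1, 4096, 8192",
    "1500, 1000, 1, 4096, 12288",
    "1600, 50, 0, 4096, 0",
]
rows = windowed_percentiles(sample, window_msec=1000,
                            percentiles=(50, 100), direction=1)
for start, pct in rows:
    print(start, pct)
```

To run it on a real log, pass an open file instead of the sample list, e.g. `windowed_percentiles(open(path), direction=1)` where `path` is whatever log file your write_lat_log prefix produced. The resulting rows can be dumped to CSV and plotted with any tool.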

For sequential writes, how could I increase throughput?

This is a huge topic in itself and I just can't do it justice here. Some starting points for you though: try switching to an asynchronous ioengine like libaio AND increasing the iodepth (e.g. to 32) AND setting direct=1. A bigger block size (e.g. 512k rather than 4k) usually helps throughput too (but don't make it too large). Please re-read the help pages/HOWTO even though it's huge, because some of the problems you are hitting are described within it (flexible also means complicated in this case...).
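To make those starting points concrete, here is a rough sketch that just assembles a fio command line with the options mentioned above (libaio is Linux only). The job name, target path, size and log prefix are placeholders I made up, not values from your setup, and the script only prints the command rather than running it. Be careful what you point filename at: a sequential write job will overwrite it.

```python
import shlex


def build_fio_args(filename, size="1g", log_prefix=None):
    """Return an argument list for a sequential-write fio job.

    filename, size and log_prefix are illustrative placeholders.
    """
    args = [
        "fio",
        "--name=seqwrite",        # hypothetical job name
        f"--filename={filename}",
        "--rw=write",             # sequential writes
        "--ioengine=libaio",      # asynchronous engine
        "--iodepth=32",           # keep up to 32 I/Os in flight
        "--direct=1",             # bypass the page cache
        "--bs=512k",              # larger blocks usually help throughput
        f"--size={size}",
    ]
    if log_prefix is not None:
        # Also record latency logs using the given file name prefix
        args.append(f"--write_lat_log={log_prefix}")
    return args


cmd = build_fio_args("/tmp/fio-testfile", log_prefix="seqwrite")
print(shlex.join(cmd))
```

Treat the numbers as things to experiment with rather than tuned values; the best iodepth and block size depend on the device.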

Would any of [python scripts in the FIO git repo for plotting ] be useful?

Yes? There are some shell based scripts (like fio2gnuplot) too. http://tfindelkind.com/2015/09/16/fio-flexible-io-tester-part9-fio2gnuplot-to-visualize-the-output/ gives an example. However, if you understand the format of the latency log file, you may find it easy to plot in any spreadsheet or statistics tool of your choosing.

Another hint - try to ensure you are using a recent version of fio (see https://github.com/axboe/fio/releases for versions; it's a fairly easy build once you have the dependencies you need - https://github.com/axboe/fio/blob/fio-3.2/README#L130 ). The online HOWTO being linked is ONLY for the latest version of fio, and many bugs have been fixed that are still present in stale versions of fio...

Good luck!

Upvotes: 2
