Reputation: 5083
I am developing an application which basically stores 2D matrices in memory and performs mathematical operations on them. When I benchmarked the software, I found that file reading and file saving were performing very badly. So I multi-threaded the file reading, which gave a tremendous boost in performance. The boost is probably not due to the I/O itself but rather to the translation of string data from the file into doubles being distributed among the threads.
Now I want to improve my file-saving performance. Simply speaking, it is not possible to multi-thread writing to a single file. So what if the data is broken up into several files (one per core), written in parallel along the lines of the sketch below? Is this the correct way to solve the problem? And how do I make all these files appear as a single file in Windows Explorer, so as to hide this complexity from the user?
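To make the idea concrete, here is a rough sketch of what I have in mind (just an illustration: the row-major std::vector<double> layout, the save_parallel name, and the matrix.partN file names are placeholders, not existing code):

    #include <algorithm>
    #include <cstdio>
    #include <string>
    #include <thread>
    #include <vector>

    // Sketch: split the matrix by rows into one file per core, write in parallel.
    void save_parallel(const std::vector<double>& data, size_t rows, size_t cols) {
        unsigned parts = std::thread::hardware_concurrency();
        if (parts == 0) parts = 1;
        size_t rows_per_part = (rows + parts - 1) / parts;
        std::vector<std::thread> workers;
        for (unsigned p = 0; p < parts; ++p) {
            workers.emplace_back([&, p] {
                size_t begin = p * rows_per_part;
                size_t end = std::min(rows, begin + rows_per_part);
                std::string name = "matrix.part" + std::to_string(p);
                FILE* f = std::fopen(name.c_str(), "w");
                if (!f) return;
                for (size_t r = begin; r < end; ++r)
                    for (size_t c = 0; c < cols; ++c)
                        std::fprintf(f, "%.15f%c", data[r * cols + c],
                                     c + 1 == cols ? '\n' : '\t');
                std::fclose(f);
            });
        }
        for (auto& t : workers) t.join();
    }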
Upvotes: 1
Views: 561
Reputation: 1
To summarize my comments: compiled with gcc -O2, a tiny C program looping a million times over

    for (long i=0; i<cnt; i++) printf("%.15f\n", log(1.0+i*sqrt((double)i)));

takes, when redirecting stdout to /tmp/my.out, 0.79s user 0.02s system 99% cpu 0.810 total. The output contains a million numbers, totalling 18999214 bytes. So you might blame your file system, operating system, or library (perhaps using C <stdio.h> functions like printf might be a bit faster than C++ operator <<). You could also save your data in some binary format and provide a .dll plugin for Excel to read it, but I don't think it is worth your effort.
Notice that I updated my sample code to output 15 digits per double-precision number!
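For reference, a self-contained version of that micro-benchmark (the loop count of one million is taken from the description above); compile with gcc -O2 and redirect stdout to a file:

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        long cnt = 1000000; /* a million iterations, as timed above */
        for (long i = 0; i < cnt; i++)
            printf("%.15f\n", log(1.0 + i * sqrt((double)i)));
        return 0;
    }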
BTW, I suggest you make your sequalator software free software, e.g. publish its source code (under a GPLv3+ license) on some repository like GitHub. You would probably get more help if you published your source code under a friendly free-software license.
You might consider switching to Linux (and using a recent GCC, e.g. 4.8); it is probably faster for such applications. (I agree that using Excel could then be an issue, but there are several free-software alternatives, e.g. Gnumeric; Scilab could also interest you.)
BTW, nothing in my answer above refers to multi-threading, because you cannot easily multi-thread the writing of a single textual file.
Upvotes: 2
Reputation: 249462
The boost is probably not due to the I/O itself but rather to the translation of string data from the file into doubles being distributed among the threads.
If that is the case, consider storing binary data instead of text. Given that you are dealing with 2D matrices, a useful format might be HDF5. You can then read and write at full speed, and it supports compression too if you need even more disk-space savings. I doubt you'll need threads at all if you do this.
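For illustration, a minimal sketch of such a write using the HDF5 C API (callable from C++); the matrix.h5 file name and /matrix dataset name are arbitrary placeholders:

    #include <hdf5.h>
    #include <vector>

    // Write a rows x cols row-major matrix as a single HDF5 dataset.
    int save_hdf5(const std::vector<double>& data, hsize_t rows, hsize_t cols) {
        hid_t file = H5Fcreate("matrix.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
        if (file < 0) return -1;
        hsize_t dims[2] = {rows, cols};
        hid_t space = H5Screate_simple(2, dims, NULL);   // 2-D dataspace
        hid_t dset = H5Dcreate2(file, "/matrix", H5T_NATIVE_DOUBLE, space,
                                H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
        herr_t status = H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL,
                                 H5P_DEFAULT, data.data());
        H5Dclose(dset);
        H5Sclose(space);
        H5Fclose(file);
        return status < 0 ? -1 : 0;
    }

Reading it back is the symmetric H5Dread call, and everything stays in a single .h5 file on disk, which also sidesteps your multiple-files-in-Explorer problem.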
Upvotes: 1