Reputation: 67
I have to save some graph data (an array of structs) to a text file. I wrote a working program using fprintf, but for extra points it needs to be faster. I have spent a couple of hours googling for anything faster and tried to use fwrite (but I wasn't able to make fwrite produce text). I cannot really find any other suitable functions.
This is my write function using fprintf:
void save_txt(const graph_t * const graph, const char *fname)
{
    int count = graph->num_edges, i = 0;
    FILE *f = fopen(fname, "w");
    if (f == NULL) {
        return; /* fopen failed: calling fprintf on a NULL stream is undefined */
    }
    while (count > 0) {
        /* the shortest valid line is "0 0 0\n", i.e. 6 characters */
        int r = fprintf(f, "%d %d %d\n", graph->edges[i].from, graph->edges[i].to, graph->edges[i].cost);
        i++;
        if (r >= 6) {
            count -= 1;
        } else {
            break;
        }
    }
    fclose(f);
}
Upvotes: 4
Views: 4590
Reputation: 57398
I would try setting a write buffer on the stream and experimenting with different buffer sizes (e.g. 1K, 2K, 4K, 8K and so on). Notice that by default your file already uses a buffer of BUFSIZ bytes, and that might already be enough.
#define BUFFERSIZE 0x1000
void save_txt(const graph_t * const graph, const char *fname)
{
    int count = graph->num_edges, i = 0;
    char buf[BUFFERSIZE]; /* setvbuf expects char *; buf must stay alive until fclose(f) */
    FILE *f = fopen(fname, "w");
    setvbuf(f, buf, _IOFBF, BUFFERSIZE); /* must come before any other operation on f */
    ...
The output file f starts out with the default BUFSIZ buffer, so it might benefit from a larger, fully buffered write buffer.
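To make the setvbuf idea concrete, here is a minimal sketch of a complete function. Several things in it are assumptions rather than anything from the question: the edge_t and graph_t layouts are guesses (only the field names appear in the original code), the name save_txt_buffered and the int return code are made up for the example, and 64 KiB is an arbitrary starting size to be tuned by measurement. The buffer lives on the heap so that it safely outlives every write and is freed only after fclose.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed layouts: only the field names are known from the question. */
typedef struct { int from, to, cost; } edge_t;
typedef struct { edge_t *edges; int num_edges; } graph_t;

#define BUFFERSIZE (64 * 1024) /* arbitrary; tune by measurement */

/* Returns 0 on success, -1 on failure (the original function returns void). */
int save_txt_buffered(const graph_t *graph, const char *fname)
{
    FILE *f = fopen(fname, "w");
    if (f == NULL)
        return -1;

    /* setvbuf must be the first operation on the stream. If the allocation
       or the call fails, the stream simply keeps its default buffer. */
    char *buf = malloc(BUFFERSIZE);
    if (buf != NULL && setvbuf(f, buf, _IOFBF, BUFFERSIZE) != 0) {
        free(buf);
        buf = NULL;
    }

    int ok = 1;
    for (int i = 0; i < graph->num_edges; i++) {
        const edge_t *e = &graph->edges[i];
        if (fprintf(f, "%d %d %d\n", e->from, e->to, e->cost) < 0) {
            ok = 0;
            break;
        }
    }

    if (fclose(f) != 0) /* fclose flushes whatever is still buffered */
        ok = 0;
    free(buf);          /* only safe after the stream is closed */
    return ok ? 0 : -1;
}
```

Whether this beats the default buffer depends on the platform and on the size of the output, so it is worth timing both variants before keeping it.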
Of course this assumes that you're writing to a relatively slow medium and that the time spent saving is significant; otherwise, whatever is slowing you down is not here, and improving save performance won't help you appreciably.
There are profiling tools like prof and gprof that can help you determine where your program is spending the most time.
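Short of a full profiler, a crude timing helper can already show whether the buffer size matters on a given machine. The sketch below is only an illustration: time_write is a made-up name, the rows it writes are synthetic, and clock() measures CPU time, so it captures formatting and library overhead but not time spent waiting on the disk.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: writes `lines` rows of three ints to `fname` through
   a stream buffer of `bufsize` bytes. Returns the CPU seconds used, or -1.0
   on error. */
double time_write(const char *fname, size_t bufsize, int lines)
{
    FILE *f = fopen(fname, "w");
    if (f == NULL)
        return -1.0;

    char *buf = malloc(bufsize);
    if (buf == NULL || setvbuf(f, buf, _IOFBF, bufsize) != 0) {
        fclose(f);
        free(buf);
        return -1.0;
    }

    clock_t start = clock();
    int ok = 1;
    for (int i = 0; i < lines && ok; i++)
        ok = fprintf(f, "%d %d %d\n", i, i + 1, i % 100) > 0;
    if (fclose(f) != 0) /* include the final flush in the measurement */
        ok = 0;
    clock_t end = clock();

    free(buf);
    return ok ? (double)(end - start) / CLOCKS_PER_SEC : -1.0;
}
```

Calling it for 1024, 2048, 4096 and so on, several times each, and comparing the results gives a rough picture; single runs are noisy, so treat small differences with suspicion.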
Another, much more awkward, possibility is to combine Kiwi's answer with a hand-rolled buffered write. This avoids the code in printf that works out which format to use, which is wasted effort since you already know the format, and it uses as few I/O calls as possible (even just one if BUFFERSIZE is larger than your destination file's length).
// These variables must now be global, declared outside save_txt.
char kiwiBuf[BUFFERSIZE];
size_t kiwiPtr = 0;
FILE *f;
// Defined before my_putchar so that it is declared when my_putchar calls it.
void flushBuffer(void) {
    if (kiwiPtr) {
        fwrite(kiwiBuf, kiwiPtr, 1, f);
        kiwiPtr = 0;
    }
}
void my_putchar(char c) {
    kiwiBuf[kiwiPtr++] = c;
    // Is the buffer full?
    if (kiwiPtr == BUFFERSIZE) {
        // Yes, empty the buffer into the file.
        flushBuffer();
    }
}
You now need to flush the buffer before closing the file:
void save_txt(const graph_t * const graph, const char *fname)
{
    int i, count = graph->num_edges;
    f = fopen(fname, "w");
    if (NULL == f) {
        fprintf(stderr, "Error opening %s\n", fname);
        exit(-1);
    }
    for (i = 0; i < count; i++) {
        my_put_nbr(graph->edges[i].from);
        my_putchar(' ');
        my_put_nbr(graph->edges[i].to);
        my_putchar(' ');
        my_put_nbr(graph->edges[i].cost);
        my_putchar('\n');
    }
    flushBuffer();
    fclose(f);
}
UPDATE
By declaring the my_putchar function as inline and with a 4K buffer, the above code (modified with a mock of graph reading from an array of random integers) is around 6x faster than fprintf on
Linux mintaka 4.12.8-1-default #1 SMP PREEMPT Thu Aug 17 05:30:12 UTC 2017 (4d7933a) x86_64 x86_64 x86_64 GNU/Linux
gcc version 7.1.1 20170629 [gcc-7-branch revision 249772] (SUSE Linux)
About 2x of that seems to come from buffering. Andrew Henle made me notice an error in my code: I was comparing results to a baseline of unbuffered output, but fopen uses a buffer of BUFSIZ bytes by default, and on my system BUFSIZ is 8192. So basically I've "discovered" just that: printf's checks and conversions.
Also, the overall increase (google Amdahl's Law) depends on what fraction of processing time goes into saving. Clearly, if one hour of processing requires one second of saving, doubling the saving speed saves you half a second, while increasing processing speed by 1% saves you about 36 seconds, or roughly 72 times more.
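The arithmetic behind that comparison can be checked with a few lines of C. This is just Amdahl's Law written out; the function names are made up for the example.

```c
/* Overall speedup when a fraction p of the run time (0..1) is made
   s times faster and the rest is left unchanged (Amdahl's Law). */
double amdahl_speedup(double p, double s)
{
    return 1.0 / ((1.0 - p) + p / s);
}

/* Seconds saved on a run that originally took total_seconds. */
double seconds_saved(double total_seconds, double p, double s)
{
    return total_seconds * (1.0 - 1.0 / amdahl_speedup(p, s));
}
```

With a 3601 second run (one hour of processing plus one second of saving), seconds_saved(3601.0, 1.0 / 3601.0, 2.0) gives 0.5, while seconds_saved(3601.0, 3600.0 / 3601.0, 1.01) gives about 35.6, which is the "36 seconds" above before rounding.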
My own sample code was designed to be completely save-oriented with very large graphs; in this situation, any small improvement in writing speed reaps potentially huge rewards, which might be unrealistic in the real-world case.
Also (in answer to a comment), while using a buffer that is too small will slow saving, it is not at all certain that using a larger buffer will help. Say that the whole graph generates 1.2 KB of output in its entirety; then of course any buffer size above 1.2 KB will yield no improvement. Actually, allocating more memory might negatively impact performance.
Upvotes: 5
Reputation: 67
I had to be 1.3x faster than fprintf; here is the code that worked for me. I have to say that I had to submit it multiple times, and sometimes I passed only 1 out of 5 tests with the same code. In conclusion, it is faster than fprintf, but not reliably 1.3 times faster.
void save_txt(const graph_t * const graph, const char *fname)
{
    int count = graph->num_edges, i = 0;
    char c = '\n';
    char d = ' ';
    char buffer[15]; /* enough for a 32-bit int: sign, 10 digits, terminating NUL */
    FILE *f = fopen(fname, "w");
    if (f == NULL) {
        return; /* fopen failed: nothing to write to */
    }
    while (count > 0) {
        /* note: itoa is not standard C and is only provided by some compilers */
        itoa(graph->edges[i].from, buffer, 10);
        fputs(buffer, f);
        putc(d, f);
        itoa(graph->edges[i].to, buffer, 10);
        fputs(buffer, f);
        putc(d, f);
        itoa(graph->edges[i].cost, buffer, 10);
        fputs(buffer, f);
        putc(c, f);
        i++;
        count -= 1;
    }
    fclose(f);
}
Upvotes: 1
Reputation: 21
I would write a small function, say print_graph(int, int, int), and call write directly in it, or something like this, with my_putchar being a write call:
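int my_put_nbr(int nb)
{
    /* work on the magnitude as unsigned: -nb would overflow for INT_MIN */
    unsigned int u = (unsigned int)nb;
    if (nb < 0)
    {
        my_putchar('-');
        u = 0u - u;
    }
    /* print the leading digits first, then the last one */
    if (u > 9)
        my_put_nbr((int)(u / 10));
    my_putchar((char)('0' + u % 10));
    return (0);
}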
Upvotes: 2