Drew Verlee

Reputation: 1900

When is an opportune moment to flush the output buffer, and some basic C++

I'm reading Accelerated C++ and the author writes:

Flushing the output buffers at opportune moments is an important habit when you are writing programs that might take a long time to run. Otherwise, some of the program's output might languish in the system's buffers for a long time between when your program writes it and when you see it.

Please correct me if I misunderstand any of these concepts:

There is this explanation I found:

Flushing an output device means that all preceding output operations are required to be completed immediately. This is related to the issue of buffering, which is an optimization technique used by the operating system. Roughly speaking, the operating system reserves (and usually exercises) the right to put the data "on stand by" until it decides that it has an amount of data large enough to justify the cost associated with sending the data to the screen. In some cases, however, we need the guarantee that the output operations performed in our program are completed at a given point in the execution of our program, so we flush the output device.

Continuing from that explanation, I read that three events cause the system to flush the buffer:

  1. The buffer becomes full, at which point it is flushed automatically.
  2. The library is asked to read from the standard input stream. (Is reading from the standard input stream something like std::cin >> name; ?)
  3. We explicitly tell it to. How do we explicitly tell it to?

Despite this, I don't feel like I fully grasp the following:

Upvotes: 2

Views: 2393

Answers (5)

James Kanze

Reputation: 153987

Taking your questions one by one:

  • A buffer, in general, is just a block of memory used to temporarily hold data. When writing to an `std::ofstream`, characters are sent to a `std::filebuf`, which typically, by default, will simply put them into a buffer rather than outputting immediately to the system. When using an `std::ofstream`, there are actually two buffers in play, one in the `ofstream` (within your process), and one in the OS.
  • The standard speaks of the underlying data as a sequence of characters on an external support, with the buffer representing a window into that sequence; outputting data may only update the image in the buffer, and flushing "synchronizes" the image in the buffer with the image of the data the OS has. That is a reasonably good description if you're outputting to a real file, but it doesn't really fit if you're outputting directly to a serial port, or something like that, where the OS doesn't maintain any "image" of the data. Basically, if you've written data to the stream which hasn't been transferred to the OS, flushing the buffer will transfer it to the OS (which means that the `ofstream` can reuse the buffer memory for further buffering). Flushing the buffer typically (i.e. on all of the implementations I know) only synchronizes with the OS (which is all that the standard requires); it doesn't ensure that the data has actually been written to disk. Depending on the application, this may or may not be an issue.
  • The "output device" is anything the system wants it to be: a file, a window on the screen, or in older times or on simpler systems, a printer or a serial port. And the explanation you cite is very misleading (or rather isn't talking about `ofstream`), because flushing an `ofstream` doesn't ensure that all preceding output operations are fully finished. All it ensures is that the data in the stream buffer has been transferred to (synchronized with) the OS. In most cases (at least under Windows and Unix), all this means is that the data has been moved from one buffer (in your process) to another (in the OS).
  • The opportune moments will depend a lot on what the application is doing. As a general rule, I'd suggest flushing often, so that if your program crashes, you can see more or less how far it has gotten. (Remember, outputting `std::endl` flushes. For most simple use, just using `std::endl` instead of `'\n'` is sufficient.) There are at least two cases where you will want to think more about flushing, however. One is when you're outputting a very large amount of data in a block (i.e. without doing much more than formatting between the outputs): excessive flushing can slow the output down considerably, so you may want to consider using `'\n'` instead of `std::endl`. The other is for things like logging, where you want the data to appear immediately, even if the following data will not be output for a while; in this case, you want to be sure that the data has been flushed before continuing.

Data will be explicitly flushed if you call `std::ostream::flush()` or `std::ofstream::close()`. (In the latter case, of course, you cannot write more data later.)
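As a rough illustration of the explicit flush, here is a minimal sketch. The function name `write_flush_and_peek` and the file path are hypothetical, and the example assumes a typical desktop OS where a second handle on the same file can see data once it has been handed to the OS:

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes one line, flushes it, and returns what a
// second, independent handle can read while the output stream is still open.
std::string write_flush_and_peek(const std::string& path) {
    std::ofstream out(path);
    out << "first record\n";
    out.flush();                 // equivalent to: out << std::flush;

    std::ifstream in(path);      // sees only what has reached the OS
    std::string line;
    std::getline(in, line);

    out << "second record\n";
    out.close();                 // flushes the rest; no more writes after this
    return line;
}
```

Calling `write_flush_and_peek("flush_demo.txt")` should return the first line, because the flush has moved it out of the stream's own buffer. Without the `flush()` call, whether the line is visible at that point depends on the implementation's buffer size.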

Note too that because the data is not actually "written" until it is flushed, most possible errors cannot be detected until then. In particular, something like:

if ( output << data ) {
    //  succeeded...
}

doesn't actually work; the "success" reported by the ofstream is only that it has successfully copied the characters into its buffer (which can hardly fail).

The usual idiom when writing a large block of data, without interruption, is to just write it, without flushing, then close the file and check for errors then. This is not appropriate when writing with interruptions if you want the data to appear immediately, and it has the disadvantage that if your program crashes, some of the data you've "written" will have disappeared, which can make debugging harder.
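A minimal sketch of that idiom might look like the following. The name `write_all` is made up for illustration; the point is only that the stream state is checked after `close()`, which is when the buffered data has actually been handed to the OS:

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes every line without flushing, then closes the
// file and reports whether anything went wrong along the way.
bool write_all(const std::string& path,
               const std::vector<std::string>& lines) {
    std::ofstream out(path);
    if (!out) {
        return false;            // could not open the file at all
    }
    for (const auto& line : lines) {
        out << line << '\n';     // '\n' rather than std::endl: no flush here
    }
    out.close();                 // flushes; sets failbit if that fails
    return !out.fail();          // errors from any earlier write show up too
}
```

Checking once after `close()` catches both a failed open-time write and a failure that only surfaces when the buffer is transferred, which a per-insertion check would miss.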

Upvotes: 0

user1084944

To flush a std::ostream, use the std::flush manipulator, i.e.:

std::cout << std::flush;

Note that std::endl already flushes the stream, so if you are in the habit of ending your insertions with it, you don't need to do anything additional. It also means that if you are seeing poor performance because you flush too much, you need to switch from inserting std::endl to inserting a newline: '\n'.
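To make the difference visible, here is a small sketch that counts flushes. `CountingBuf` and the two functions are hypothetical helpers written for this example; they rely only on the fact that flushing a stream calls the stream buffer's `sync()`:

```cpp
#include <ostream>
#include <sstream>

// Hypothetical helper: a string buffer that counts how often it is flushed.
class CountingBuf : public std::stringbuf {
public:
    int flushes = 0;

protected:
    int sync() override {
        ++flushes;
        return std::stringbuf::sync();
    }
};

// Ends every line with std::endl, so every line triggers a flush.
int count_flushes_with_endl(int lines) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) {
        out << "line " << i << std::endl;
    }
    return buf.flushes;
}

// Ends every line with '\n' and flushes once at the end.
int count_flushes_with_newline(int lines) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) {
        out << "line " << i << '\n';
    }
    out.flush();
    return buf.flushes;
}
```

For 100 lines, the first version flushes 100 times and the second flushes once. Whether that difference matters in practice depends on how expensive a flush is for the real destination.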

A stream is a sequence of characters (i.e. things of type char). An output stream is one you write characters to. Typical applications are writing data to files, printing text on screen, or storing them in a std::string.

Streams often have the feature that writing 1024 characters at once is an order of magnitude (or more!) faster than writing 1 character at a time 1024 times. One of the main purposes of the notion of 'buffering' is to deal with this in a convenient fashion. Rather than writing directly to wherever you actually want the characters to go, you instead write to the buffer. Then, when you're ready, you "flush" the buffer: you move the characters from the buffer to the place where you want them. Or, if you don't care about the precise details, you use a buffer that flushes itself automatically. For example, the buffer used in a std::ofstream is typically of fixed size, and will flush whenever it's full.
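The following sketch shows the mechanism with a deliberately tiny buffer. `SmallBuf`, its 8-character capacity, and the `device` string standing in for a file or screen are all assumptions made for illustration; real implementations such as std::filebuf are more involved:

```cpp
#include <array>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical fixed-size output buffer. `device` stands in for the real
// destination; `device_writes` counts how many transfers reach it.
class SmallBuf : public std::streambuf {
public:
    std::string device;
    int device_writes = 0;

    SmallBuf() { setp(storage_.data(), storage_.data() + storage_.size()); }

protected:
    // Called when the buffer is full: flush it, then store the new character.
    int_type overflow(int_type ch) override {
        drain();
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(ch);
            pbump(1);
        }
        return traits_type::not_eof(ch);
    }

    // Called on an explicit flush.
    int sync() override {
        drain();
        return 0;
    }

private:
    void drain() {
        if (pptr() == pbase()) {
            return;              // nothing buffered
        }
        device.append(pbase(), pptr());
        ++device_writes;
        setp(storage_.data(), storage_.data() + storage_.size());
    }

    std::array<char, 8> storage_{};
};

// Writes `text` one character at a time and reports the number of transfers.
int device_writes_for(const std::string& text) {
    SmallBuf buf;
    std::ostream out(&buf);
    for (char c : text) {
        out.put(c);
    }
    out.flush();
    return buf.device_writes;
}
```

Writing 20 characters one at a time results in only 3 transfers to the "device": two when the 8-character buffer fills up, and one for the final explicit flush.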

When is it an opportune time to flush, you ask? I say you're optimizing prematurely. :) Rather than looking for the perfect moments to flush, just do it often. Put in enough flushes that you'll never find yourself in a situation where, e.g., you want to look at the data in a file but it's sitting unwritten in a buffer. Then, if it really does turn out that too many flushes are hurting performance, that's when you spend time looking into it.

Upvotes: 1

Jerry Coffin

Reputation: 490408

You explicitly flush a stream with your_stream.flush();.

What an output buffer is vs just a buffer and presumably other types of buffers...

A buffer is usually a block of memory used to hold data waiting for processing. One typical use is data that's just been read from a stream, or data waiting to be written to disk. Either way, it's generally more efficient to read/write large blocks of data at a time, so the library reads/writes an entire buffer at a time, while the client code can read/write in whatever amount is convenient (e.g., one character or one line at a time).

What it means to flush a buffer. Does it simply mean to clear the ram?

That depends. For an input buffer, yes, it typically means just clearing the contents of the buffer, discarding any data that's been read into the buffer (though it doesn't usually clear the RAM -- it just sets its internal book-keeping to say the buffer is empty).

For an output buffer, flushing the buffer normally means forcing whatever data is in the buffer to be written to the associated stream immediately.

What is the "output device" referred to in the above explanation

When you're writing data, it's whatever device you're ultimately writing to. That could be a file on the disk, the screen, etc.

And finally, after all this, when are opportune moments to flush your buffer...ugh that doesn't sound pleasant.

One obvious opportune moment is right when you finish writing data for a while, and you're going to go back to processing (or whatever) that doesn't produce any output (at least to the same destination) for a while. You don't want to flush the buffer if you're likely to produce more data going the same place right afterward -- but you also don't want to leave the data in the buffer when there's going to be a noticeable delay before you fill the buffer (or whatever) so the data will get written to its destination.

Upvotes: 0

Zaid Amir

Reputation: 4785

I think the author means stream buffers. An opportune moment to flush a buffer really depends on what your code does, how it's constructed, how the buffer is allocated, and probably the scope it's initialized in.

For stream and output buffers take a look at this.

Yes, reading from the standard input stream means using the >> operator on std::cin. (Mostly.)
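The reason a read from standard input flushes standard output is that std::cin is tied to std::cout by default. A small check of that default, using a hypothetical helper name, could look like this:

```cpp
#include <iostream>

// Hypothetical helper: reports whether std::cin is still tied to std::cout.
// While the tie is in place, std::cout is flushed before each input
// operation on std::cin, so a prompt appears before the program waits.
bool cin_is_tied_to_cout() {
    return std::cin.tie() == &std::cout;
}
```

This returns true unless the program has changed the tie itself, for example with `std::cin.tie(nullptr)`.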

You can explicitly tell a stream buffer to flush by calling, for example, ofstream::flush. Of course, other types of buffers have their own explicit flushing methods, and some might require a manual implementation.

Upvotes: 0

Some programmer dude

Reputation: 409364

This depends very much on the type of application, but one rule of thumb is to flush after you have written one record. For text that is usually after every line, for binary data after every object. If the performance seems to be too slow, then flush after every X records you write, and experiment with X until you find a value that gives acceptable performance without being so large that you lose too much data in case of a crash.

Upvotes: 0
