PaulH

Reputation: 7863

Inserting a std::string at an arbitrary location within a std::fstream

I have a Visual Studio 2008 C++ application where I would like to insert a string at an arbitrary point in a file using std::fstream. The file may be as large as 100MB, so I don't want to read it entirely into memory, modify it, and then rewrite it as a new file.

/// Insert some data into a file at a given offset
/// @param file stream to insert the data
/// @param data string to insert
/// @param offset location within the file to insert the data
void InsertString( std::fstream& file, const std::string& data, size_t offset );

The method I'm considering now is to walk the file backwards from the end, moving each byte toward the end by the length of the data string, and then write the new string into the gap that opens up at the offset.
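For reference, the shifting approach described above might look roughly like the sketch below. It moves the tail in blocks rather than byte by byte, starting from the end of the file so that nothing is overwritten before it has been read. This is only an illustration under assumptions that are not stated in the question: the stream is opened with std::ios::in | std::ios::out | std::ios::binary, an offset past the end is clamped to an append, and the 64 KB block size is arbitrary.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Sketch of the InsertString signature from the question.
// Assumes `file` is open for binary reading and writing.
void InsertString(std::fstream& file, const std::string& data, std::size_t offset)
{
    if (data.empty()) return;

    // Find the current file size by seeking to the end.
    file.clear();
    file.seekg(0, std::ios::end);
    const std::streamoff end = file.tellg();
    if (end < 0) return;                          // stream is unusable
    const std::size_t size = static_cast<std::size_t>(end);
    if (offset > size) offset = size;             // assumption: clamp to append

    std::vector<char> buf(64 * 1024);             // arbitrary block size
    std::size_t remaining = size - offset;        // tail bytes that must move

    // Walk the tail from its end toward the insertion point, moving each
    // block forward by data.size() bytes.
    while (remaining > 0) {
        const std::size_t chunk = std::min(remaining, buf.size());
        const std::size_t src = offset + remaining - chunk;
        file.seekg(static_cast<std::streamoff>(src), std::ios::beg);
        file.read(&buf[0], static_cast<std::streamsize>(chunk));
        file.seekp(static_cast<std::streamoff>(src + data.size()), std::ios::beg);
        file.write(&buf[0], static_cast<std::streamsize>(chunk));
        remaining -= chunk;
    }

    // The gap at `offset` is now free; write the new data into it.
    file.seekp(static_cast<std::streamoff>(offset), std::ios::beg);
    file.write(data.data(), static_cast<std::streamsize>(data.size()));
    file.flush();
}
```

Every byte after the offset is still read once and written once, so the cost grows with the distance between the insertion point and the end of the file.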

What is the most efficient way of accomplishing this?

Upvotes: 1

Views: 603

Answers (3)

Jerry Coffin

Reputation: 490218

You've just stated one of the basic motivations for database formats, and a need they fulfill.

Based on that, the solution seems pretty obvious, at least to me: you need to use a database format of some sort, probably along with code that directly supports that format. Nearly any decent db format will support what you've said you need, so it's mostly a matter of deciding which code base provides an interface you like.

Of course, if you need to produce (for example) a normal text file as the result, then this isn't really a solution. For a case like this, you pretty much need to bite the bullet and live with copying a lot of data around. At least in my experience, OSes are sufficiently oriented toward reading files sequentially, that unless your modification is quite close to the end of the file, you may easily find it's more efficient to read and write the whole file rather than copying just enough to make space for the new data.

Upvotes: 3

rerun

Reputation: 25505

You can use seekp to move the file pointer to the desired position, but you will need to know the file size, using something like GetFileSize(). Either way, you will need to read all of the data after the insertion point and write it back out. I would just read a block and write a block if memory consumption is the main concern, or use a memory-mapped file if performance is the main issue and let the OS handle the buffering.
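As an aside, if a Win32 call is not wanted, the size can usually also be obtained from the stream itself by seeking to the end. The helper below is a minimal sketch; the name StreamSize is hypothetical, and it assumes the stream was opened in binary mode, where the reported position corresponds to a byte count on common implementations.

```cpp
#include <fstream>

// Hypothetical helper: returns the size of an open binary stream in bytes,
// or -1 if the position cannot be determined.
std::streamoff StreamSize(std::fstream& file)
{
    file.clear();
    const std::fstream::pos_type saved = file.tellg();  // remember position
    file.seekg(0, std::ios::end);
    const std::streamoff size = file.tellg();           // offset of the end
    file.seekg(saved);                                   // restore position
    return size;
}
```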

Upvotes: 0

Mark B

Reputation: 96261

Unless this is an extremely rare operation, just don't. Strongly reconsider your file format so you don't have to insert strings in the middle, because, as you suspect, you have to shift data down, and in large files that's not going to be very efficient if you're doing it a lot.

If this is really a rare occurrence, then I'd say just read the old file up to the insertion point, writing a new file as you go, write the new string, and then finish read/writing from the old file. Finally, remove the old file and rename the new one.
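A minimal sketch of that copy-and-rename sequence, assuming the file is identified by a path rather than an already open stream. The function name, the ".tmp" suffix and the 64 KB buffer are all made up for illustration, and an offset past the end of the file is treated as an append.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: copies `path` to a temporary file with `data`
// inserted at `offset`, then replaces the original. Returns true on success.
bool InsertStringViaCopy(const std::string& path, const std::string& data,
                         std::size_t offset)
{
    const std::string tmp = path + ".tmp";        // assumed temp-file name
    {
        std::ifstream in(path.c_str(), std::ios::binary);
        std::ofstream out(tmp.c_str(), std::ios::binary | std::ios::trunc);
        if (!in || !out) return false;

        std::vector<char> buf(64 * 1024);         // arbitrary block size
        std::size_t left = offset;

        // Copy everything before the insertion point.
        while (left > 0) {
            const std::size_t want = std::min(left, buf.size());
            in.read(&buf[0], static_cast<std::streamsize>(want));
            const std::streamsize got = in.gcount();
            if (got <= 0) break;                  // file shorter than offset
            out.write(&buf[0], got);
            left -= static_cast<std::size_t>(got);
        }

        // Write the new string, then copy the rest of the old file.
        out.write(data.data(), static_cast<std::streamsize>(data.size()));
        while (in.read(&buf[0], static_cast<std::streamsize>(buf.size())),
               in.gcount() > 0) {
            out.write(&buf[0], in.gcount());
        }

        out.flush();
        if (!out) {                               // a write failed
            out.close();
            std::remove(tmp.c_str());
            return false;
        }
    }   // both streams are closed here

    // Remove the old file first, since rename may fail if the target exists.
    if (std::remove(path.c_str()) != 0) return false;
    return std::rename(tmp.c_str(), path.c_str()) == 0;
}
```

Unlike an in-place shift, this leaves the original file untouched until the new one is complete, at the cost of temporarily needing disk space for both copies.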

Upvotes: 2
