Reputation: 13976
I would like to find the first occurrence of an ANSI string in a binary file, using C++.
I know the string class has a handy find function, but I don't know how to use it if the file is big, say 5-10 MB.
Do I need to copy the whole file into a string in memory first? If yes, how can I be sure that none of the binary characters get corrupted while copying?
Or is there a more efficient way to do it, without the need for copying it into a string?
Upvotes: 1
Views: 2348
Reputation: 62975
Do I need to copy the whole file into a string in memory first?
No.
Or is there a more efficient way to do it, without the need for copying it into a string?
Of course; open the file with an std::ifstream (be sure to open in binary mode rather than text mode), create a pair of multi_pass iterators (from Boost.Spirit) around the stream, then search for the string with std::search.
Upvotes: 5
Reputation: 17114
First of all, don't worry about corrupted characters. (But don't forget to open the file in binary mode either!) Now, suppose your search string is n characters long. Then you can search the whole file a block at a time, as long as you make sure to keep the last n-1 characters of each block to prepend to the next block. That way you won't lose matches that occur across block boundaries. So you can use that handy find function without having to read the whole file into memory at once.
Upvotes: 2