hyperknot

Reputation: 13976

How to look for an ANSI string in a binary file?

I would like to find the first occurrence of an ANSI string in a binary file, using C++.

I know the string class has a handy find function, but I don't know how to use it if the file is big, say 5-10 MB.

Do I need to copy the whole file into a string in memory first? If yes, how can I be sure that none of the binary characters get corrupted while copying?

Or is there a more efficient way to do it, without the need for copying it into a string?

Upvotes: 1

Views: 2348

Answers (3)

ildjarn

Reputation: 62975

Do I need to copy the whole file into a string in memory first?

No.

Or is there a more efficient way to do it, without the need for copying it into a string?

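Of course; open the file with a std::ifstream (be sure to open it in binary mode rather than text mode), create a pair of multi_pass iterators (from Boost.Spirit) around the stream, then search for the string with std::search. The multi_pass wrapper is needed because std::search requires forward iterators, while plain stream iterators such as std::istreambuf_iterator are single-pass input iterators.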

Upvotes: 5

TonyK

Reputation: 17114

First of all, don't worry about corrupted characters. (But don't forget to open the file in binary mode either!) Now, suppose your search string is n characters long. Then you can search the whole file a block at a time, as long as you make sure to keep the last n-1 characters of each block to prepend to the next block. That way you won't lose matches that occur across block boundaries. So you can use that handy find function without having to read the whole file into memory at once.
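The block-at-a-time approach described above could look roughly like the following. This is only a sketch under some assumptions: the function name find_in_file and the block_size parameter are made up for illustration, the needle is passed as a std::string (so it may contain embedded zero bytes), and -1 is used as the "not found" value.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: returns the byte offset of the first match,
// or -1 if the needle is not found or the file cannot be opened.
long long find_in_file(const std::string& path, const std::string& needle,
                       std::size_t block_size = 1 << 16) {
    if (needle.empty()) return 0;

    // Binary mode so that no byte is translated while reading.
    std::ifstream in(path, std::ios::binary);
    if (!in) return -1;

    std::string block(block_size, '\0');  // read buffer
    std::string window;                   // carried-over tail + current block
    long long window_offset = 0;          // file offset of window[0]

    while (in) {
        in.read(&block[0], static_cast<std::streamsize>(block.size()));
        std::streamsize got = in.gcount();
        if (got <= 0) break;
        window.append(block.data(), static_cast<std::size_t>(got));

        std::size_t pos = window.find(needle);
        if (pos != std::string::npos)
            return window_offset + static_cast<long long>(pos);

        // Keep only the last n-1 characters, so a match that straddles
        // the block boundary is still found on the next iteration.
        std::size_t keep = needle.size() - 1;
        if (window.size() > keep) {
            window_offset += static_cast<long long>(window.size() - keep);
            window.erase(0, window.size() - keep);
        }
    }
    return -1;
}
```

Because only n-1 characters are carried over, the tail can never contain a complete match on its own, so the first hit reported by find is also the first occurrence in the file. Memory use stays at roughly one block plus the tail, regardless of file size.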

Upvotes: 2

lhf

Reputation: 72312

If you can mmap the file into memory, you can avoid the copy: the mapped bytes can be searched in place.
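A minimal sketch of the mmap idea, assuming a POSIX system (it will not compile as-is on Windows). The function name find_in_mapped_file is hypothetical, and -1 is used for "not found" as well as for any error; std::search is run directly over the mapped bytes.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: returns the byte offset of the first match,
// or -1 if the needle is not found or the file cannot be mapped.
long long find_in_mapped_file(const char* path, const std::string& needle) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);  // empty files cannot be mapped
        return -1;
    }
    std::size_t size = static_cast<std::size_t>(st.st_size);

    // Read-only private mapping; the kernel pages the file in on demand.
    void* map = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (map == MAP_FAILED) return -1;

    const char* begin = static_cast<const char*>(map);
    const char* end = begin + size;
    const char* hit = std::search(begin, end, needle.begin(), needle.end());
    long long result = (hit == end) ? -1 : static_cast<long long>(hit - begin);

    munmap(map, size);
    return result;
}
```

Raw pointers into the mapping are random-access iterators, so std::search works on them directly and zero bytes in the file need no special handling.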

Upvotes: 0
