Reputation: 23
I'm a beginner to C++, so please be understanding...
I want to search for a string (needle) within a file (haystack) by reading each line separately and then searching for the needle in that line. Ideally, for more robust code, I would like to read the individual words on the line, so that larger (i.e. multiple) white-space gaps between words are ignored when searching for the needle (e.g. perhaps using the >> operator?). That is, the spacing in the needle string should not have to match the spacing between words in the file exactly.
So, for example, if I have a needle:
"The quick brown fox jumps over the lazy dog"
in the file this might be written (on a particular line) as:
... "The quick   brown    fox jumps  over the   lazy dog" ...
Is there an efficient way to do this?
Currently I include the necessary number of spaces in my needle string but I would like to improve the code, if possible.
My code currently looks something like the following (within a method in a class):
double var1, var2;
std::string skip; // a std::string grows as needed, unlike a fixed char buffer
std::ifstream haystack("filename"); // the file is only read, so an input stream suffices
std::string needle = "This is a string, and var1 =";
std::string line;
std::string::size_type pos; // same type as npos, so no narrowing conversion to int
bool found = false;
// Search for needle
while (!found && std::getline(haystack, line)) {
    pos = line.find(needle); // find position of needle in current line
    if (pos != std::string::npos) { // current line contains needle
        std::stringstream lineStream(line);
        lineStream.seekg(pos + needle.length());
        lineStream >> var1;
        lineStream >> skip;
        lineStream >> var2;
        found = true;
    }
}
(Just for clarity, after finding the string (needle) I want to store the next word on that line or in some cases store the next word, then skip a word and store the following word, for example:
With a file:
... ...
... This is a string, and var1 = 111 and 777 ...
... ...
I want to extract var1 = 111; var2 = 777;
)
Thanks in advance for any help!
Upvotes: 2
Views: 5272
Reputation: 53047
This will work, although I think there's a shorter solution:
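// Returns the index in ins of the LAST character of the match (not the first),
// or std::string::npos if str is not found. Extra spaces in ins are skipped.
std::size_t myfind(const std::string& ins, const std::string& str) {
    if (str.empty())
        return std::string::npos; // nothing to match; avoids dereferencing str.end()
    std::string::const_iterator mi = str.begin();
    for (std::string::const_iterator it = ins.begin(); it != ins.end(); ++it) {
        if (*it == *mi) {
            ++mi;
            if (mi == str.end())
                return std::distance(ins.begin(), it);
        }
        else {
            if (*it == ' ')
                continue; // skip extra spaces in ins
            mi = str.begin(); // mismatch: restart the needle
            if (*it == *mi)
                ++mi; // the mismatching character may itself start a new match
        }
    }
    return std::string::npos;
}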
// use:
myfind("foo The quick brown fox jumps over the lazy dog bar", "The quick brown fox");
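A shorter route, along the lines of the >> idea in the question, is to split both the needle and the line into words and search for the needle's word sequence. This is only a sketch: split_words and find_words are hypothetical helper names, and it assumes the needle always ends on a word boundary in the file (so "var1 =111" would not match "var1 =").

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split a string into whitespace-separated words using operator>>.
std::vector<std::string> split_words(const std::string& text) {
    std::istringstream stream(text);
    std::vector<std::string> words;
    std::string word;
    while (stream >> word)
        words.push_back(word);
    return words;
}

// Returns the index of the first word after the needle within line,
// or std::string::npos if the needle's words do not occur consecutively.
std::size_t find_words(const std::vector<std::string>& line,
                       const std::vector<std::string>& needle) {
    if (needle.empty())
        return std::string::npos;
    std::vector<std::string>::const_iterator it =
        std::search(line.begin(), line.end(), needle.begin(), needle.end());
    if (it == line.end())
        return std::string::npos;
    return static_cast<std::size_t>(it - line.begin()) + needle.size();
}
```

For the sample line in the question, the returned index points at the word 111, and the word two positions later is 777; both can then be converted with std::stod or a std::istringstream.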
Upvotes: 1
Reputation: 59269
To solve your problem, strip the extra spaces from both the needle and the haystack line. std::unique (used here in its copying form, std::unique_copy) is designed for this. Normally it is used after sorting the range, but in this case all we really want to do is remove duplicate spaces.
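#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

struct dup_space
{
    // true when both neighbouring characters are whitespace
    bool operator()( char lhs, char rhs ) const
    {
        // cast first: std::isspace is undefined for negative char values
        return std::isspace( static_cast<unsigned char>( lhs ) )
            && std::isspace( static_cast<unsigned char>( rhs ) );
    }
};
void despacer( const std::string& in, std::string& out )
{
    out.reserve( in.size() );
    std::unique_copy( in.begin(), in.end(),
                      std::back_inserter( out ),
                      dup_space()
                    );
}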
You should use it like this:
void find( const std::string& needle, std::istream& haystack ) // streams cannot be copied, so pass by reference
{
    std::string real_needle;
    despacer( needle, real_needle );
    std::string line;
    std::string real_line;
    // test the result of getline itself, so a failed read is never processed
    while( std::getline( haystack, line ) )
    {
        real_line.clear();
        despacer( line, real_line );
        auto ret = real_line.find( real_needle );
        if( ret != std::string::npos )
        {
            // found it
            // do something creative
        }
    }
}
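To make the "do something creative" step concrete, here is a minimal end-to-end sketch that collapses the whitespace, finds the needle, and reads the two numbers from the question's sample line. The names collapse_spaces and extract_after are made up for this illustration, and it assumes the values always follow the needle as "number, word, number". Note that a run of whitespace is reduced to its first character, so a tab in the file would still not match a space in the needle.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Collapse every run of whitespace into its first character.
std::string collapse_spaces(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    std::unique_copy(in.begin(), in.end(), std::back_inserter(out),
                     [](unsigned char lhs, unsigned char rhs) {
                         return std::isspace(lhs) && std::isspace(rhs);
                     });
    return out;
}

// Scan the stream line by line; on the first line that contains the needle
// followed by "number word number", store both numbers and return true.
bool extract_after(std::istream& haystack, const std::string& needle,
                   double& var1, double& var2) {
    const std::string real_needle = collapse_spaces(needle);
    std::string line;
    while (std::getline(haystack, line)) {
        const std::string real_line = collapse_spaces(line);
        const std::string::size_type pos = real_line.find(real_needle);
        if (pos == std::string::npos)
            continue; // needle not on this line
        // Parse only the text that follows the needle.
        std::istringstream rest(real_line.substr(pos + real_needle.size()));
        std::string skipped; // the word between the two values
        if (rest >> var1 >> skipped >> var2)
            return true;
    }
    return false;
}
```

Taking the remainder with substr avoids seeking inside a stringstream, and returning a bool lets the caller tell "needle not found" apart from a successful read.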
Upvotes: 0
Reputation: 726589
You can find all sequences of white-space characters in the line string and replace each of them with a single space. You can collapse multiple spaces in the needle the same way, and the rest of your search algorithm will continue working unchanged.
Here is a way to remove the duplicate spaces using the STL:
#include <iostream>
#include <algorithm>
#include <string>
#include <iterator>
using namespace std;
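// Stateful predicate: returns true for every space that directly follows another space.
struct DupSpaceDetector {
    bool wasSpace;
    DupSpaceDetector() : wasSpace(false) {}
    bool operator()(int c) {
        if (c == ' ') {
            if (wasSpace) {
                return true;
            } else {
                wasSpace = true;
                return false;
            }
        } else {
            wasSpace = false;
            return false;
        }
    }
};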
int main() {
    string source("The quick   brown    fox jumps  over the   lazy dog");
string destination;
DupSpaceDetector detector;
remove_copy_if(
source.begin()
, source.end()
, back_inserter(destination)
, detector
);
cerr << destination << endl;
return 0;
}
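One caveat with DupSpaceDetector above: standard algorithms are allowed to copy the predicate they are given, so a predicate that carries state is not strictly guaranteed to behave as intended, even though it works on common implementations. A stateless alternative is the erase-unique idiom with a binary predicate, sketched below. The name collapse_in_place is made up for this example, and it only treats ' ' as a space, like the code above.

```cpp
#include <algorithm>
#include <string>

// Collapse each run of ' ' into a single ' ' in place.
void collapse_in_place(std::string& text) {
    // std::unique moves the kept characters to the front and returns the new
    // logical end; erase then drops the leftover tail.
    text.erase(std::unique(text.begin(), text.end(),
                           [](char lhs, char rhs) {
                               return lhs == ' ' && rhs == ' ';
                           }),
               text.end());
}
```

Because the predicate only looks at two adjacent characters, it needs no member variable, and the same function can be applied to both the needle and each line before calling find.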
Upvotes: 1