Mike Trader

Reputation: 8704

C++ Parsing a line out of a large file

I have read an entire file into a string from a memory-mapped file, using the Win32 API:

CreateFile( "WarandPeace.txt", GENERIC_READ, FILE_SHARE_READ, 0, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0 )

etc...

Each line is terminated with a CRLF. I need to find something on a line, like "Spam" in the line "I love Spam and Eggs", and return the entire line (without the CRLF) in a string, or a pointer to its location in the string. The original string cannot be altered.

EDITED:

Something like this:

string ParseStr( string sIn, string sDelim, int nField )
{  
    int match, LenStr, LenDelim, ePos, sPos(0), count(0);
    string sRet;

        LenDelim = sDelim.length();
        LenStr   = sIn.length();
        if( LenStr < 1 || LenDelim < 1 ) return ""; // Empty String
        if( nField < 1 ) return "";
        //=========== cout << "LenDelim=" << LenDelim << ", sIn.length=" << sIn.length() << endl;


        for( ePos=0; ePos < LenStr; ePos++ ) // iterate through the string
        { // cout << "sPos=" << sPos << ", LenStr=" << LenStr << ", ePos=" << ePos << ", sIn[ePos]=" << sIn[ePos] << endl;
            match = 1; // default = match found
            for( int k=0; k < LenDelim; k++ ) // Byte value 
            {  
                if( ePos+k >= LenStr ){ // ran past the end of the string: no match
                    match = 0; break; }
                else if( sIn[ePos+k] != sDelim[k] ){ // match failed
                    match = 0; break; }
            }
            //===========

            if( match || (ePos == LenStr-1) )  // process line
            { 
                if( !match ) ePos = LenStr + LenDelim; // (ePos == LenStr-1) 
                count++; // cout << "sPos=" << sPos << ", ePos=" << ePos << " >" << sIn.substr(sPos, ePos-sPos) << endl;
                if( count == nField ){ sRet = sIn.substr(sPos, ePos-sPos); break; } 
                ePos = ePos+LenDelim-1; // jump over Delim
                sPos = ePos+1; // Begin after Delim
            } // cout << "Final ePos=" << ePos << ", count=" << count << ", LenStr=" << LenStr << endl;
        }// next

    return sRet;      
} 

If you like it, vote it up. If not, let's see what you got.

Upvotes: 0

Views: 434

Answers (3)

Igor

Reputation: 27250

Do you really have to do it in C++? Perhaps you could use a language which is more appropriate for text processing, like Perl, and apply a regular expression.

Anyway, if you are doing it in C++, a loop over Prev_delim_position = sIn.find(sDelim, Prev_delim_position), advancing past each delimiter it finds, looks like a fine way to do it.
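
A rough sketch of that find-based loop, in case it helps. The names NthField and LineContaining are made up for illustration and are not from the question. It assumes the whole file is already in a std::string, that the search text does not itself contain a line terminator, and that copying just the one matching line is acceptable (the input is taken by const reference and never modified).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the nField-th (1-based) field of sIn, where
// fields are separated by sDelim. Returns "" if there is no such field.
std::string NthField(const std::string& sIn, const std::string& sDelim, int nField)
{
    if (sIn.empty() || sDelim.empty() || nField < 1) return "";

    std::size_t start = 0;  // start of the current field
    for (int count = 1; ; ++count) {
        // Position of the next delimiter, or npos if this is the last field
        std::size_t end = sIn.find(sDelim, start);
        if (count == nField) {
            // substr clamps the length, so npos means "to the end of the string"
            return sIn.substr(start, end == std::string::npos ? end : end - start);
        }
        if (end == std::string::npos) return "";  // ran out of fields
        start = end + sDelim.length();            // skip over the delimiter
    }
}

// Hypothetical helper: returns the first line of text that contains needle,
// without its line terminator, or "" if needle does not occur.
// Assumes needle does not contain the terminator itself.
std::string LineContaining(const std::string& text, const std::string& needle,
                           const std::string& eol = "\r\n")
{
    std::size_t hit = text.find(needle);
    if (hit == std::string::npos) return "";

    // The line starts just after the previous terminator (or at offset 0)
    std::size_t prev = text.rfind(eol, hit);
    std::size_t begin = (prev == std::string::npos) ? 0 : prev + eol.length();

    // The line ends at the next terminator (or at the end of the text)
    std::size_t next = text.find(eol, hit);
    std::size_t len = (next == std::string::npos) ? next : next - begin;
    return text.substr(begin, len);
}
```

With a buffer holding "I hate Eggs\r\nI love Spam and Eggs\r\nBye", LineContaining(buffer, "Spam") would give back the second line, and NthField(buffer, "\r\n", 2) would give the same line by position. If even that one copy is too much, the begin offset and length computed in LineContaining could be returned instead of the substring.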

Upvotes: 0

chollida

Reputation: 7894

If you are trying to match a more complex pattern, you can always fall back on Boost's regex library.

See: http://www.boost.org/doc/libs/1_41_0/libs/regex/doc/html/index.html

#include <fstream>
#include <iostream>
#include <string>
#include <boost/regex.hpp>

using namespace std;

int main( ) 
{
   std::string sre("Spam");
   boost::regex re;

   try
   {
      // Compile the regular expression once, case-insensitively
      re.assign(sre, boost::regex_constants::icase);
   }
   catch (boost::regex_error& e)
   {
      cout << sre << " is not a valid regular expression: \""
        << e.what() << "\"" << endl;
      return 1;
   }

   ifstream in("main.cpp");
   if (!in.is_open()) return 1;

   string line;
   while (getline(in,line))
   {
      // regex_search finds the pattern anywhere in the line;
      // regex_match would require the whole line to match
      if (boost::regex_search(line, re))
      {
         cout << re << " matches " << line << endl;
      }
   }
}

Upvotes: 2

pm100

Reputation: 50180

system("grep ....");

Upvotes: -1
