Reputation: 633
This may have been asked before, but I couldn't work out how to extract formatted data. Below is my code, which is meant to extract all text between the strings "[87]" and "[90]" in a text file.
According to the output, the positions reported for [87] and [90] are the same.
void ExtractWebContent::filterContent(){
    string str, str1;
    string positionOfCurrency1 = "[87]";
    string positionOfCurrency2 = "[90]";
    size_t positionOfText1, positionOfText2;
    ifstream reading;
    reading.open("file_Currency.txt");
    while (!reading.eof()){
        getline (reading, str);
        positionOfText1 = str.find(positionOfCurrency1);
        positionOfText2 = str.find(positionOfCurrency2);
        cout << "positionOfCurrency1 " << positionOfText1 << endl;
        cout << "positionOfCurrency2 " << positionOfText2 << endl;
        //str1= str.substr (positionOfText);
        cout << "String" << str1 << endl;
    }
    reading.close();
}
An Update on the currency file:
[79]More »Brent slips to $102 on worries about euro zone economy
Market Data
* Currencies
CAPTION: Currencies
Name Price Change % Chg
[80]USD/SGD
1.2606 -0.00 -0.13%
USD/SGD [81]USDSGD=X
[82]EUR/SGD
1.5242 0.00 +0.11%
EUR/SGD [83]EURSGD=X
Upvotes: 1
Views: 1223
Reputation: 1238
Boost.Tokenizer can be helpful for parsing out a string, but it gets a little trickier if the delimiters have to be bracketed numbers like the ones you have. With the delimiters as described, a regex is probably adequate.
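As a rough illustration of the regex route, here is a minimal sketch using std::regex from the standard library. The function name betweenMarkersRegex is hypothetical, and the sketch assumes the whole text has already been read into one string, since the two markers may sit on different lines.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the question): returns the text between the
// markers "[87]" and "[90]", or an empty string if they are not both present.
std::string betweenMarkersRegex(const std::string& text) {
    // \[87\] and \[90\] match the literal markers. [\s\S]*? lazily matches
    // any characters, newlines included, so the match stops at the first "[90]".
    static const std::regex pattern(R"(\[87\]([\s\S]*?)\[90\])");
    std::smatch match;
    if (std::regex_search(text, match, pattern)) {
        return match[1].str();  // capture group 1 is the text between the markers
    }
    return std::string();
}
```

The brackets have to be escaped because an unescaped [87] would be read as a character class matching a single '8' or '7'.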
Upvotes: 1
Reputation: 1250
All that does is concatenate the output of reading and the strings "[1]" and "[2]". I'm guessing this code resulted from a rather literal extrapolation of similar code using scanf. scanf (as well as the rest of C) still works in C++, so if that works for you I would use it.
That said, there are various levels of sophistication at which you can do this. Using regexes is one of the most powerful/flexible ways, but it might be overkill. The quickest way in my opinion is just to find the index i1 of "[1]" and the index i2 of "[2]", then take the substring between i1+3 and i2. In code, supposing std::string line has the text:
size_t i1 = line.find("[1]");
size_t i2 = line.find("[2]");
// substr takes (start, length), so pass the length rather than the end index
std::string out(line.substr(i1+3, i2 - (i1+3)));
Warning: no error checking.
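For completeness, here is one way the same idea might look with the error checking added. This is only a sketch; extractBetween and its parameter names are made up for illustration. It checks both results of find against std::string::npos and uses the marker's own length instead of the hard-coded 3.

```cpp
#include <string>

// Hypothetical helper: copies the text strictly between the first `open`
// marker and the next `close` marker into `out`. Returns false, leaving
// `out` untouched, if either marker is missing.
bool extractBetween(const std::string& line,
                    const std::string& open,
                    const std::string& close,
                    std::string& out) {
    const std::size_t i1 = line.find(open);
    if (i1 == std::string::npos) return false;

    // Start just past the opening marker and search for the closing
    // marker only from there on.
    const std::size_t start = i1 + open.size();
    const std::size_t i2 = line.find(close, start);
    if (i2 == std::string::npos) return false;

    // The second argument of substr is a length, not an end index.
    out = line.substr(start, i2 - start);
    return true;
}
```

A call such as extractBetween(line, "[1]", "[2]", out) then either fills out or reports that a marker was not found, instead of passing npos on to substr.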
Upvotes: 0
Reputation: 59811
That really depends on what 'extracting data' means. In simple cases you can just read the file into a string and then use string member functions (especially find and substr) to extract the segment you are interested in. If you are interested in data per line, getline is the way to go for line extraction. Apply find and substr as before to get the segment.
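A minimal sketch of the read-everything-then-search approach could look like the following. The names readAll and sliceBetween are hypothetical, and the functions take a std::istream so that an std::ifstream opened on the currency file, or a string stream, can be passed in. Searching the whole text rather than one line at a time matters when the two markers sit on different lines.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads everything remaining in the stream into one string.
std::string readAll(std::istream& in) {
    std::ostringstream buffer;
    buffer << in.rdbuf();  // copy the whole stream buffer in one go
    return buffer.str();
}

// Hypothetical helper: returns the text between `first` and the next `second`,
// or an empty string if either marker is missing.
std::string sliceBetween(const std::string& text,
                         const std::string& first,
                         const std::string& second) {
    const std::size_t begin = text.find(first);
    if (begin == std::string::npos) return std::string();

    const std::size_t start = begin + first.size();
    const std::size_t end = text.find(second, start);
    if (end == std::string::npos) return std::string();

    return text.substr(start, end - start);  // substr takes (start, length)
}
```

With an input stream named reading, as in the question, the call would be sliceBetween(readAll(reading), "[87]", "[90]").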
Sometimes a simple find won't get you far and you will need a regular expression to easily get to the parts you are interested in.
Often simple parsers evolve and soon outgrow even regular expressions. This often signals that it is time for the very large hammer of C++ parsing: Boost.Spirit.
Upvotes: 2