Reputation: 8001
How to extract parts from regex in C++?
For example, I have patterns like this, where
a new line means "followed by":
delimiter string,
name,
':' character,
list of Xs, where each X is a name followed by a ';' character
I can use regex for matching, but is there a way to not only match, but also extract parts from the pattern? For example:
$DatasetName: A; B; C;
is a given string, and I would like to extract the dataset name, and then the column names A, B, and C.
Upvotes: 0
Views: 131
Reputation: 11944
Well, as already suggested, you could parse it by hand, similar to this (it is for demonstration purposes only and does not claim to be perfect):
#include <iostream>
#include <vector>
#include <string>

// Walks the input once: everything before ':' is the name,
// every ';'-terminated chunk after it is one value.
// Note: characters are copied verbatim, so the leading '$' stays in the
// name and the space after each ';' stays in the following value.
bool parse_by_hand(const std::string& phrase)
{
    enum parse_state
    {
        parse_name,
        parse_value,
    };

    std::string name, current_value;
    std::vector<std::string> values;
    parse_state state = parse_name;

    for(std::string::const_iterator iterator = phrase.begin(); iterator != phrase.end(); ++iterator)
    {
        switch(state)
        {
        case parse_name:
            if(*iterator != ':')
                name += *iterator;
            else
                state = parse_value; // ':' ends the name
            break;
        case parse_value:
            if(*iterator != ';')
                current_value += *iterator;
            else
            {
                // ';' ends the current value; stay in parse_value for the next one
                values.push_back(current_value);
                current_value.clear();
            }
            break;
        default:
            return false;
        }
    }

    // Error checking here, name parsed? values parsed?
    return true;
}

int main()
{
    std::string phrase("$DatasetName: A; B; C;");
    parse_by_hand(phrase);
}
As for std::regex, my first shot was something like ([^:]*):(([^;]*);)* but unless I'm mistaken (and I hope someone corrects me if I am), a repeated capture group only gives you the last matched value, not all of them, so you would still have to iterate with regex_search several times, which takes the ease of 'one-liner regex matching' off the table. Alternatively, if std::regex is not a must and you can use Boost, take a look at Repeated captures; this should solve the capture group issue.
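To illustrate the "multiple iterations" route with plain std::regex, here is a minimal sketch of a two-step approach: one regex_match to validate the line and capture the name and the whole list, then a std::sregex_iterator pass over the list to pull out each item. The ParsedLine struct, the parse_with_regex name and both patterns are my own assumptions for illustration; in particular, I assume the leading '$' is optional and that names contain no whitespace.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical result type for this sketch (not from the question).
struct ParsedLine
{
    std::string name;
    std::vector<std::string> columns;
};

// Returns true and fills `out` if `phrase` looks like "$Name: A; B; C;".
bool parse_with_regex(const std::string& phrase, ParsedLine& out)
{
    // Group 1: the name. Group 2: the whole "A; B; C;" list.
    // Assumption: optional '$' delimiter, names without whitespace.
    static const std::regex line_re(R"(\s*\$?([^:;\s]+)\s*:((?:\s*[^;\s]+\s*;)+)\s*)");
    // Matches a single "X;" entry, capturing X without surrounding spaces.
    static const std::regex item_re(R"(([^;\s]+)\s*;)");

    std::smatch m;
    if(!std::regex_match(phrase, m, line_re))
        return false;

    out.name = m[1].str();
    out.columns.clear();

    // Keep the list in a named string: the iterators must not outlive it.
    const std::string list = m[2].str();
    for(std::sregex_iterator it(list.begin(), list.end(), item_re), end; it != end; ++it)
        out.columns.push_back((*it)[1].str());

    return true;
}
```

Called with "$DatasetName: A; B; C;", this fills name with DatasetName and columns with A, B and C. It is still two regexes rather than one, which is the trade-off described above.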
Upvotes: 1