fduff

Reputation: 3821

std::regex search and replace

I need help getting a regular expression to work correctly with the following source string:

<path d="M 1434.9,982.0 L 1461.3,982.0  L 1461.3,1020.5  L 1434.9,1020.5 z " stroke-width="1" stroke="#008000" fill="none"/>

On a line like this, I need to change the stroke-width and stroke values without affecting the rest of the content.

So far I'm doing this in two steps: first replacing the stroke value, then replacing the stroke-width value. The second step is where I get strange results, as shown below.

Snippet:

string s("<path d=\"M 1434.9,982.0 L 1461.3,982.0  L 1461.3,1020.5  L 1434.9,1020.5 z \" stroke-width=\"1\" stroke=\"#008000\" fill=\"none\"/>");
std::regex re("stroke=\".+\" ");
cout << "0. " << s << endl;
s = std::regex_replace(s, re, "stroke=\"#00FF00\" ");
cout << "1. " << s << endl;
re = "stroke-width=\".+\" .*?";
s = std::regex_replace(s, re, "stroke-width=\"3\" ");
cout << "2. " << s << endl;

Output:

0.     <path d="M 1434.9,982.0 L 1461.3,982.0  L 1461.3,1020.5  L 1434.9,1020.5 z " stroke-width="1" stroke="#008000" fill="none"/>
1.     <path d="M 1434.9,982.0 L 1461.3,982.0  L 1461.3,1020.5  L 1434.9,1020.5 z " stroke-width="1" stroke="#00FF00" fill="none"/>
2.     <path d="M 1434.9,982.0 L 1461.3,982.0  L 1461.3,1020.5  L 1434.9,1020.5 z " stroke-width="3" fill="none"/>

This is almost what I want, except that in output 2 the stroke attribute is gone!

I'm currently using std::regex, but I'm open to boost::regex too. I'd appreciate any pointers on this.

Upvotes: 0

Views: 121

Answers (3)

Saleem

Reputation: 8978

You can replace both values with a single regex:

^(.*stroke-width=)(.*?)(\s.*stroke=["'])(.*?)(["'].*)$

Example:

std::string text = R"(<path d="M 1434.9,982.0 L 1461.3,982.0  L 1461.3,1020.5  L 1434.9,1020.5 z " stroke-width="1" stroke="#008000" fill="none"/>)";
std::string result;

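// Build the replacement text: $1, $3 and $5 re-insert the unchanged captures,
// while the new stroke-width ("5") and stroke ("#000000") replace groups 2 and 4.
char buff[100];
snprintf(buff, sizeof(buff), "$1\"%s\"$3%s$5", "5","#000000");
std::string replacement_text = buff;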

std::regex re(R"(^(.*stroke-width=)(.*?)(\s.*stroke=["'])(.*?)(["'].*)$)",
           std::regex_constants::icase);

result = std::regex_replace(text, re, replacement_text);

cout << result << endl;

This code prints:

<path d="M 1434.9,982.0 L 1461.3,982.0  L 1461.3,1020.5  L 1434.9,1020.5 z " stroke-width="5" stroke="#000000" fill="none"/>
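One thing to watch with a format string such as `$3%s` is that the inserted value becomes part of the replacement syntax, so a value that starts with a digit or contains `$` may be read as part of a group reference. A minimal sketch that sidesteps this by assembling the result from the captures directly is shown here. The helper name `restyle` and its parameters are hypothetical and not part of the answer above; the pattern is the same five-group one.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: sets stroke-width and stroke on one SVG element line.
// Returns the input unchanged if the pattern does not match.
std::string restyle(const std::string& line,
                    const std::string& width,
                    const std::string& color) {
    // Same five groups as the pattern above; groups 2 and 4 hold the old values.
    static const std::regex re(
        R"re(^(.*stroke-width=)(.*?)(\s.*stroke=["'])(.*?)(["'].*)$)re");
    std::smatch m;
    if (!std::regex_match(line, m, re)) {
        return line;
    }
    // Group 2 includes its own quotes, so quotes are re-added around the new
    // width; the quotes around the colour belong to groups 3 and 5.
    return m[1].str() + '"' + width + '"' + m[3].str() + color + m[5].str();
}
```

Calling `restyle(text, "5", "#000000")` on the source string should give the same line as above. Because the new values are concatenated rather than passed through the replacement format, they are inserted literally.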

Upvotes: 0

fduff

Reputation: 3821

I've just tried another approach, making the quantifiers non-greedy, which works in this case.

// changing the 1st regex to
std::regex re("stroke=\".+?\" ");

// and the 2nd to
re = "stroke-width=\".+?\" ";

This time I get the right output:

0.     <path d="M 1434.9,982.0 L 1461.3,982.0  L 1461.3,1020.5  L 1434.9,1020.5 z " stroke-width="1" stroke="#008000" fill="none"/>
1.     <path d="M 1434.9,982.0 L 1461.3,982.0  L 1461.3,1020.5  L 1434.9,1020.5 z " stroke-width="1" stroke="#00FF00" fill="none"/>
2.     <path d="M 1434.9,982.0 L 1461.3,982.0  L 1461.3,1020.5  L 1434.9,1020.5 z " stroke-width="3" stroke="#00FF00" fill="none"/>
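For reference, the two non-greedy replacements can be wrapped in one function. This is only a sketch: the name `restyle_two_step` and its parameters are made up for illustration, and it keeps the assumption from the patterns above that each attribute is followed by a space.

```cpp
#include <regex>
#include <string>

// Hypothetical helper wrapping the two non-greedy replacements.
std::string restyle_two_step(std::string s,
                             const std::string& color,
                             const std::string& width) {
    // "stroke=" cannot match inside "stroke-width=" (a '-' follows "stroke"
    // there), so each pattern touches exactly one attribute.
    static const std::regex stroke_re(R"re(stroke=".+?" )re");
    static const std::regex width_re(R"re(stroke-width=".+?" )re");
    s = std::regex_replace(s, stroke_re, "stroke=\"" + color + "\" ");
    s = std::regex_replace(s, width_re, "stroke-width=\"" + width + "\" ");
    return s;
}
```

With the source string, `restyle_two_step(s, "#00FF00", "3")` should produce line 2 of the output above. An attribute that sits directly before `/>` with no space after it would not be matched by these patterns.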

Upvotes: 0

Pelle Nilsson

Reputation: 1010

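The .+ is greedy: it matches as many characters as it can, so it runs past the closing quotation mark whenever a later quotation mark followed by a space still lets the rest of the pattern match. That is why the second replacement also swallows the stroke attribute. Use the non-greedy version .+? instead.

Also, the trailing .*? in the last pattern is lazy and has nothing after it, so it always matches the empty string and can be removed.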
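To see the difference directly, a small helper that returns the first match for a pattern can be used to compare the variants. This is a sketch for experimentation only: `first_match` is a hypothetical name, and the negated character class `[^"]+` mentioned below is an alternative that was not proposed in the question.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first substring of `text` matched by
// `pattern`, or an empty string if there is no match.
std::string first_match(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str(0) : std::string();
}
```

For the attribute text `stroke-width="1" stroke="#008000" fill="none"/>`, the greedy pattern `stroke-width=".+" ` matches `stroke-width="1" stroke="#008000" ` (through the last quote that is followed by a space), whereas `stroke-width=".+?" ` matches only `stroke-width="1" `. The pattern `stroke-width="[^"]+" ` gives the same short match, since `[^"]+` cannot cross a quotation mark at all.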

Upvotes: 1
