Jason Stott

Reputation: 11

C# Regular Expressions

I have a string that contains multiple delimited groups, plus some parts that fall outside those groups. I need to remove a character, in this case ^, but only within certain groups, and not in the parts of the string that fall outside them.

Here's the input string:

STARTDONTREPLACEME^ENDDONTREPLACEME~STARTREPLACEME^ENDREPLACEME~STARTREPLACEME^BLAH^ENDREPLACEME~STARTDONTREPLACEME^BLAH^ENDDONTREPLACEME~

Here's what the output string should look like:

STARTDONTREPLACEME^ENDDONTREPLACEME~STARTREPLACEMEENDREPLACEME~STARTREPLACEMEBLAHENDREPLACEME~STARTDONTREPLACEME^BLAH^ENDDONTREPLACEME~

I need to do it using C# and can use regular expressions.

I can match the string into groups that should and shouldn't be replaced, but I'm struggling to produce the final output string.

Upvotes: 0

Views: 231

Answers (3)

Alan Moore

Reputation: 75222

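// s0 holds the input string.
Regex rgx = new Regex(
  @"\^(?=(?>(?:(?!(?:START|END)(?:DONT)?REPLACEME).)*)ENDREPLACEME)");

// Remove every ^ that the lookahead places inside a REPLACEME group.
string s1 = rgx.Replace(s0, String.Empty);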

Explanation: Each time a ^ is found, the lookahead scans ahead for an ending delimiter (ENDREPLACEME). If it finds one without seeing any of the other delimiters first, the match must have occurred inside a REPLACEME group. If the lookahead reports failure, it indicates that the ^ was found either between groups or within a DONTREPLACEME group.

Because lookaheads are zero-width assertions, only the ^ will actually be consumed in the event of a successful match.

Be aware that this will only work if delimiters are always properly balanced and groups are never nested within other groups.
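To make the behaviour easier to check, here is a minimal sketch that wraps the same pattern in a helper and can be run against the sample string from the question. The class and method names (CaretStripper, Strip) are made up for illustration; only the pattern itself comes from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative, not part of any library.
public static class CaretStripper
{
    // A ^ matches only if, scanning ahead without crossing any other
    // delimiter, the next delimiter reached is ENDREPLACEME.
    private static readonly Regex CaretInsideReplaceGroup = new Regex(
        @"\^(?=(?>(?:(?!(?:START|END)(?:DONT)?REPLACEME).)*)ENDREPLACEME)");

    public static string Strip(string input)
    {
        // The lookahead is zero-width, so only the ^ itself is removed.
        return CaretInsideReplaceGroup.Replace(input, String.Empty);
    }
}
```

Calling CaretStripper.Strip on the question's input should give the question's expected output, assuming the delimiters are balanced and not nested as noted above.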

Upvotes: 1

David Boike

Reputation: 18615

If you can already separate the string into groups that should be replaced and groups that shouldn't, then instead of supplying a single replacement string you can pass a MatchEvaluator (a delegate that takes a Match and returns a string). It is called once per match, so it can decide which case it is dealing with and return the replacement for that group alone.

You may also use an additional regex inside the MatchEvaluator. This solution produces the expected output:

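// Lazy .*? stops at the first ENDREPLACEME, so each match covers one group only;
// a greedy .+ would run on to the last ENDREPLACEME in the string.
Regex outer = new Regex(@"STARTREPLACEME.*?ENDREPLACEME", RegexOptions.Compiled);
Regex inner = new Regex(@"\^", RegexOptions.Compiled);

// start holds the input string; the lambda is the MatchEvaluator.
string replaced = outer.Replace(start, m =>
{
    return inner.Replace(m.Value, String.Empty);
});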
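One thing worth checking with this approach is what happens when a DONTREPLACEME group sits between two REPLACEME groups: a greedy quantifier between the delimiters would span from the first STARTREPLACEME to the last ENDREPLACEME and strip the carets in between. The sketch below uses a lazy quantifier so that each match covers a single group. The class and method names (GroupwiseReplace, StripCarets) are hypothetical, and it assumes groups are balanced and never nested.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative, not part of any library.
public static class GroupwiseReplace
{
    // Lazy .*? ends the match at the first ENDREPLACEME after STARTREPLACEME.
    private static readonly Regex ReplaceGroup =
        new Regex(@"STARTREPLACEME.*?ENDREPLACEME");

    public static string StripCarets(string input)
    {
        // The MatchEvaluator lambda receives one REPLACEME group at a time
        // and returns that group with every ^ removed.
        return ReplaceGroup.Replace(input, m => m.Value.Replace("^", String.Empty));
    }
}
```

Text outside the matches, including every DONTREPLACEME group, is copied to the result unchanged, because Regex.Replace only rewrites the matched spans.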

Upvotes: 0

holtavolt

Reputation: 4458

I'm not sure I get exactly what you're having trouble with, but it didn't take long to come up with this result:

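// Lazy (.*?) keeps each match to a single STARTREPLACEME...ENDREPLACEME group.
string strRegex = @"STARTREPLACEME(.*?)ENDREPLACEME";
RegexOptions myRegexOptions = RegexOptions.None;
Regex myRegex = new Regex(strRegex, myRegexOptions);
string strTargetString = @"STARTDONTREPLACEME^ENDDONTREPLACEME~STARTREPLACEME^ENDREPLACEME~STARTREPLACEME^BLAH^ENDREPLACEME~STARTDONTREPLACEME^BLAH^ENDDONTREPLACEME~";

// Rebuild each group from its delimiters plus the captured content with every ^ removed;
// a fixed replacement string would discard content such as BLAH.
string result = myRegex.Replace(strTargetString,
    m => "STARTREPLACEME" + m.Groups[1].Value.Replace("^", String.Empty) + "ENDREPLACEME");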

I put it together using my favorite online regex tool: http://regexhero.net/tester/

Is that helpful?

Upvotes: 1
