NomadTraveler

Reputation: 1084

C# Regex remove character within the element name only, without replacing the value

I want to replace ':' inside XML element names only, using a regex in C#.

I am aware that parsing the document is the right approach, rather than regex, but this is a legacy project that uses Regex to rewrite the XML document content. It is not the ideal way to process an XML document, but there is nothing I can do about that.

I am not good with regular expressions and can't figure out how to replace ':' only in the element tags and not in the values.

For example <tag:name> the value with the tag http://www.example.com </tag:name>

I want to replace ':' with '_' only within the element name, not in the value. So the result should be:

<tag_name> the value with the tag http://www.example.com </tag_name>

Any idea?

Thanks!

Upvotes: 2

Views: 1053

Answers (2)

Michael Low

Reputation: 24506

Does this work for you?

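// Requires: using System.Text.RegularExpressions;
// Match each whole tag, from '<' up to the next '>'.
Regex tagRegex = new Regex("<[^>]+>");
yourXML = tagRegex.Replace(yourXML, delegate(Match thisMatch)
{
   // Replaces every ':' inside the tag, including any in attribute values.
   return thisMatch.Value.Replace(":", "_");
});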
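
If the tags can carry attributes (for example an href holding a URL), a narrower pattern may be safer. The following is only a sketch under a few assumptions: element names start with a letter or underscore, and a '<' followed by a name only ever appears at the start of a real tag (no CDATA sections or comments containing tag-like text). The class name TagNameFixer and the method Fix are hypothetical names chosen for illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class TagNameFixer
{
    // Group 1: "<" or "</" followed by the prefix part of the element name.
    // The ':' right after it is the only character that gets replaced.
    private static readonly Regex PrefixedName =
        new Regex(@"(</?[A-Za-z_][\w.-]*):");

    public static string Fix(string xml)
    {
        // ${1} re-inserts the captured "<prefix"; '_' takes the place of ':'.
        return PrefixedName.Replace(xml, "${1}_");
    }
}
```

Called as TagNameFixer.Fix(yourXML), this should leave element text and attribute values alone, since the colon is only matched directly after the opening of a tag name. Prefixed attribute names such as xmlns:foo are also left unchanged, which may or may not be what the legacy code needs.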

Upvotes: 1

Brigand

Reputation: 86230

This needle should do what you want:

<[^>]*(:)[^>]*>

The first capture group will contain the ':' found inside the tag. To do a replacement, replace (<[^>]*)(:)([^>]*>) with $1_$3, where $1 and $3 refer to the first and third capture groups (the parts of the tag before and after the colon).
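
In C#, that replacement could be written roughly as follows. This is a minimal sketch; the class name ColonInTagReplacer and the method ReplaceOnce are hypothetical names, and only Regex.Replace comes from the .NET library.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper around the pattern and replacement given above.
public static class ColonInTagReplacer
{
    public static string ReplaceOnce(string xml)
    {
        // $1 = the tag text before the colon, $3 = the tag text after it.
        // Each matched tag gets exactly one colon replaced.
        return Regex.Replace(xml, "(<[^>]*)(:)([^>]*>)", "$1_$3");
    }
}
```

For the example in the question this yields <tag_name> ... </tag_name> with the URL left intact. Note that the first group is greedy, so when a tag contains more than one colon (say, a prefixed attribute as well), only the last colon in that tag is replaced; input of that kind would need a different pattern or repeated passes.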

Upvotes: 2
