Reputation: 273
I have the following code snippet. The search criterion is to find every img tag that has id="someImage":
<img id="someImage" src="C:\logo.png" height="64" width="104" alt="myImage" />
I want to replace
src="C:\logo.png" with src="someothervalue"
so that the final output would be
<img id="someImage" src="someothervalue" height="64" width="104" alt="myImage" />
How can I achieve this using regex?
Thank you.
Upvotes: 1
Views: 1017
Reputation: 19057
You can work with groups in a regex. You create groups by putting parentheses in your regular expression. The Match object you get back then contains a Groups collection:
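using System.Text.RegularExpressions;

string input = "<html><img id=\"someImage\" src=\"C:\\logo.png\" height=\"64\" width=\"104\" alt=\"myImage\" /></html>";
// Group 1 captures everything from <img up to the src attribute; group 4 captures the old src value.
var regex = new Regex("(<img(.+?)id=\"someImage\"(.+?))src=\"([^\"]+)\"");
// Rebuild each match: keep group 1 and append the new src attribute.
string output = regex.Replace(
    input,
    match => match.Groups[1].Value + "src=\"someothervalue\""
);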
In the example above there will be 5 groups:
Groups[0] is the whole match: <img id="someImage" src="C:\logo.png"
Groups[1] is everything before the src attribute: <img id="someImage"
Groups[2] and Groups[3] are the (.+?) parts.
Groups[4] is the original value of the src attribute: C:\logo.png
In the example I replace the whole match with the value of Groups[1] followed by a new src attribute.
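If it is unclear what each group holds, a small sketch like the one below can help by listing the captures of the first match. It reuses the pattern and input from this answer; the class and method names (GroupDemo, Capture) are made up for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Same pattern as in the answer above.
    private static readonly Regex ImgSrc =
        new Regex("(<img(.+?)id=\"someImage\"(.+?))src=\"([^\"]+)\"");

    // Returns the text captured by every group of the first match,
    // or an empty array when the pattern does not match.
    public static string[] Capture(string input)
    {
        Match match = ImgSrc.Match(input);
        if (!match.Success) return new string[0];

        var values = new string[match.Groups.Count];
        for (int i = 0; i < values.Length; i++)
            values[i] = match.Groups[i].Value;
        return values;
    }

    public static void Main()
    {
        string input = "<html><img id=\"someImage\" src=\"C:\\logo.png\" height=\"64\" width=\"104\" alt=\"myImage\" /></html>";
        string[] groups = Capture(input);
        // Quote each value so leading and trailing spaces are visible.
        for (int i = 0; i < groups.Length; i++)
            Console.WriteLine("Groups[" + i + "] = '" + groups[i] + "'");
    }
}
```

For this particular input, Groups[1] ends with the space that precedes src, and Groups[2] and Groups[3] each hold a single space, because (.+?) is lazy but must still match at least one character.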
Footnote: While regular expressions can sometimes be adequate for manipulating an HTML document, they are often not the best tool. If you know in advance that you are working with XHTML, you can use XmlDocument + XPath. If it is HTML, you can use HtmlAgilityPack.
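For the XHTML case, the XmlDocument + XPath route might look roughly like this. It is a minimal sketch that assumes the markup is well-formed and declares no default XML namespace (otherwise the XPath would need an XmlNamespaceManager); the names XhtmlSrcReplacer and ReplaceSrc are hypothetical.

```csharp
using System;
using System.Xml;

public static class XhtmlSrcReplacer
{
    // Loads well-formed XHTML, sets src on every <img id="someImage">,
    // and returns the serialized document.
    public static string ReplaceSrc(string xhtml, string newSrc)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xhtml); // throws XmlException if the markup is not well-formed

        // XPath: every img element, at any depth, whose id attribute equals "someImage".
        XmlNodeList nodes = doc.SelectNodes("//img[@id='someImage']");
        foreach (XmlElement img in nodes)
            img.SetAttribute("src", newSrc);

        return doc.OuterXml;
    }

    public static void Main()
    {
        string input = "<html><img id=\"someImage\" src=\"C:\\logo.png\" height=\"64\" width=\"104\" alt=\"myImage\" /></html>";
        Console.WriteLine(ReplaceSrc(input, "someothervalue"));
    }
}
```

Unlike the regex, this does not depend on the order of the attributes or on the spacing between them.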
Upvotes: 1
Reputation: 93050
It is not a good idea to use regex for XML. Depending on the language, you should use an XML reader, extract the <img> node, and then check its id. One useful language for querying XML data, supported by many XML libraries, is XPath.
In C# you can look at the XmlDocument class (and related classes). Another option is XmlReader. The latter offers only sequential access, while the former loads the whole tree into memory, so the former is easier to use (especially if your XML content is not too big).
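To illustrate the sequential style, here is a rough sketch that uses XmlReader to locate the image and read its current src. XmlReader is read-only, so this only finds the value; rewriting it would need XmlDocument or a paired XmlWriter. The names ImgSrcReader and FindSrc are invented for the example, and the input is assumed to be well-formed XML.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ImgSrcReader
{
    // Scans forward through the document and returns the src of the first
    // <img id="someImage"> element, or null if there is none.
    public static string FindSrc(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element
                    && reader.Name == "img"
                    && reader.GetAttribute("id") == "someImage")
                {
                    return reader.GetAttribute("src");
                }
            }
        }
        return null;
    }

    public static void Main()
    {
        string input = "<html><img id=\"someImage\" src=\"C:\\logo.png\" height=\"64\" width=\"104\" alt=\"myImage\" /></html>";
        Console.WriteLine(FindSrc(input));
    }
}
```

Because the reader never holds more than the current node, this approach stays cheap on large documents, at the cost of more manual bookkeeping.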
Upvotes: 1