Reputation: 45
Using VB.net, I have a string that contains HTML. The HTML has several img tags within it. I am trying to grab an entire particular img tag based on the src containing specific characters (image002) so that I can replace the entire image tag with some new code.
What I have so far:
dim bodyContent as string = "<html><body><img src='image001.png'/><img src='image002.png'/></body></html>"
dim searchStr as string = "image002"
Dim imgRegex As New Regex("@""<img.*?src=""(?" & searchStr & ".*?)"".*?>""", RegexOptions.IgnoreCase)
bodyContent = imgRegex.Replace(bodyContent, "<div class='newCode'><a href='https://mywebsite.net/ViewAttachment'><img src='https://mywebsite.net/ViewThumbnail'></a></div>")
However, my RegEx is not correct. Any advice on how to get the correct RegEx?
Upvotes: 3
Views: 624
Reputation: 626950
You can use
Dim imgRegex As New Regex("<img[^>]+" & searchStr & "[^>]*>", RegexOptions.IgnoreCase)
The regex matches:
<img - the literal <img string
[^>]+ - one or more chars other than >
& searchStr & - the literal text inside searchStr (note it works here like that because the variable only contains word chars; in a generic case, you need to escape it using Regex.Escape(searchStr))
[^>]*> - zero or more chars other than > and then a > char.

Full VB.NET demo:
Dim bodyContent as string = "<html><body><img src='image001.png'/><img src='image002.png'/></body></html>"
Dim searchStr as string = "image002"
Dim imgRegex As New Regex("<img[^>]+" & searchStr & "[^>]*>", RegexOptions.IgnoreCase)
bodyContent = imgRegex.Replace(bodyContent, "<div class='newCode'><a href='https://mywebsite.net/ViewAttachment'><img src='https://mywebsite.net/ViewThumbnail'></a></div>")
Console.Write(bodyContent)
Output:
<html><body><img src='image001.png'/><div class='newCode'><a href='https://mywebsite.net/ViewAttachment'><img src='https://mywebsite.net/ViewThumbnail'></a></div></body></html>
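The note about Regex.Escape can be illustrated with a short sketch. It assumes a hypothetical file name, image(2).png, that contains regex metacharacters; the input string, the module name EscapeDemo, and the shortened replacement markup are made up for illustration and are not part of the original question.

```vb
Imports System
Imports System.Text.RegularExpressions

Module EscapeDemo
    Sub Main()
        ' Hypothetical input: the second file name contains "(", ")" and ".",
        ' all of which are regex metacharacters.
        Dim bodyContent As String = "<html><body><img src='image001.png'/><img src='image(2).png'/></body></html>"
        Dim searchStr As String = "image(2).png"

        ' Regex.Escape turns the search text into a pattern that matches it literally.
        Dim imgRegex As New Regex("<img[^>]+" & Regex.Escape(searchStr) & "[^>]*>", RegexOptions.IgnoreCase)

        ' Shortened placeholder replacement, used here only to keep the sketch compact.
        bodyContent = imgRegex.Replace(bodyContent, "<div class='newCode'>replaced</div>")
        Console.WriteLine(bodyContent)
    End Sub
End Module
```

Without Regex.Escape, the parentheses in this search string would be read as a capturing group and the dot as "any char", so the pattern would look for image2 followed by one char and png, and the tag would not be matched.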
Upvotes: 1