Reputation: 11
I need some help with a VB RegEx.
I've got two RegEx that each need to do one specific thing.
RegEx one - I am not exactly sure how to do this, but I need to get the value of the href attribute of a link, e.g.
String = "<a href=""test.html"">"
I need the RegEx to return .... test.html
RegEx Two - I have partly got this working.
I've got tags like
RegEx = "<div class=""top""(.*?)</div>"
String = "<div class=""top""><a><b><div class=""bottom""></div></b></a></div>"
The problem I have is that this isn't returning anything. It should return everything within "top", but it returns nothing.
Upvotes: 1
Views: 118
Reputation: 53111
Well, if your HTML doesn't contain nested tags, you can do the first part with a regex (the more control you have over the source you're searching, the more certain you can be of your results).
<a href=""([^""]+)"">
(quotes are doubled because the pattern sits inside a VB string literal). The test.html will be found in the first capturing group, referred to as $1.
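To make the first case concrete, here is a minimal sketch of that pattern in VB.NET, with the closing quote included before the `>`. The module name HrefDemo and the helper ExtractHref are made up for illustration, and it assumes the link is written exactly as `<a href="...">` with double quotes and no other attributes.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module HrefDemo
    ' Hypothetical helper: returns the href value of the first <a href="..."> found,
    ' or an empty string when the pattern does not match.
    Function ExtractHref(html As String) As String
        ' Group 1 captures everything between the two double quotes.
        Dim m As Match = Regex.Match(html, "<a href=""([^""]+)"">")
        Return If(m.Success, m.Groups(1).Value, "")
    End Function

    Sub Main()
        Console.WriteLine(ExtractHref("<a href=""test.html"">")) ' prints test.html
    End Sub
End Module
```

Anything that deviates from that exact shape (single quotes, extra attributes before href, different spacing) will not match, which is why this only holds up when you control the markup.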
For the second part, I suspect the nested tags are what it's failing on. The thing with regex and HTML is that regex doesn't cope well with nesting, or with markup that isn't well formed but still renders as expected in a browser.
Can you post some search source for the second case so we can look?
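As a rough illustration of the nesting problem, this sketch runs the pattern from the question against the string from the question. The module name NestedDivDemo and the helper FirstCapture are made up for illustration. The lazy `.*?` stops at the first `</div>` it meets, which closes the inner "bottom" div rather than the outer "top" one.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module NestedDivDemo
    ' Hypothetical helper: returns capture group 1 of the first match, or "" if there is none.
    Function FirstCapture(input As String, pattern As String) As String
        Dim m As Match = Regex.Match(input, pattern)
        Return If(m.Success, m.Groups(1).Value, "")
    End Function

    Sub Main()
        Dim html As String = "<div class=""top""><a><b><div class=""bottom""></div></b></a></div>"
        Dim pattern As String = "<div class=""top""(.*?)</div>"

        ' The capture ends at the inner </div>, so the outer div is cut short.
        Console.WriteLine(FirstCapture(html, pattern))
        ' prints ><a><b><div class="bottom">
    End Sub
End Module
```

Switching to a greedy `.*` would reach the last `</div>` in the input instead, which only happens to be the right one when nothing else follows the "top" div.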
Upvotes: 0
Reputation: 545943
Neither use-case can be solved well with regular expressions.
Use an HTML parser instead, e.g. the HTML Agility Pack.
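A minimal sketch of what that could look like for both cases, assuming the HtmlAgilityPack package is referenced in the project (it is not part of the .NET framework). The module name ParserDemo and the sample markup are made up for illustration.

```vbnet
Imports System
Imports HtmlAgilityPack

Module ParserDemo
    Sub Main()
        ' Sample markup combining both cases from the question (an assumption for this sketch).
        Dim html As String = "<a href=""test.html"">link</a>" &
            "<div class=""top""><a><b><div class=""bottom""></div></b></a></div>"

        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' Case one: read the href attribute of the first link that has one.
        Dim link As HtmlNode = doc.DocumentNode.SelectSingleNode("//a[@href]")
        If link IsNot Nothing Then
            Console.WriteLine(link.GetAttributeValue("href", ""))
        End If

        ' Case two: take everything inside the div whose class is "top".
        ' The parser tracks nesting, so the inner div does not end the outer one early.
        Dim top As HtmlNode = doc.DocumentNode.SelectSingleNode("//div[@class='top']")
        If top IsNot Nothing Then
            Console.WriteLine(top.InnerHtml)
        End If
    End Sub
End Module
```

SelectSingleNode returns Nothing when no node matches the XPath, hence the checks before use.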
Upvotes: 3