Reputation: 59
I'm fairly new to regular expressions.
I want to find all the tags that have a src or an href attribute in an HTML page. I found two patterns that each work separately, but not when I combine them:
string pattern = "<(?:[^>]*?\\s+)?src=([\"'])(.*?)\\1|<(?:[^>]*?\\s+)?href=([\"'])(.*?)\\1";
Any idea?
Thanks.
Upvotes: 2
Views: 42
Reputation: 626851
To parse HTML in C#, you should use an HTML parser, such as HtmlAgilityPack.
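For illustration, here is a rough sketch of what the parser route might look like. It assumes the HtmlAgilityPack NuGet package is installed and a C# version that supports top-level statements; the sample input and the names `html` and `parsedUrls` are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: NuGet package "HtmlAgilityPack" is referenced

// Hypothetical sample input
string html = "<img src=\"a.png\"><a href='b.html'>link</a>";

var doc = new HtmlDocument();
doc.LoadHtml(html);

var parsedUrls = new List<string>();

// XPath: any element carrying a src or an href attribute.
// SelectNodes returns null (not an empty collection) when nothing matches.
var nodes = doc.DocumentNode.SelectNodes("//*[@src or @href]");
if (nodes != null)
{
    foreach (HtmlNode node in nodes)
    {
        foreach (string name in new[] { "src", "href" })
        {
            // The second argument is the default returned when the attribute is absent.
            string value = node.GetAttributeValue(name, null);
            if (value != null)
            {
                parsedUrls.Add(value);
            }
        }
    }
}

Console.WriteLine(string.Join(", ", parsedUrls));
```

Unlike the regex, this also picks up unquoted attribute values and tags that carry both attributes.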
As for combining two patterns that contain capturing groups and backreferences, remember that capturing groups are numbered from left to right regardless of alternation operators. Your pattern therefore has 4 capturing groups (numbered 1, 2, 3, 4), so in the second alternative (the href branch) you need to replace \\1 with \\3. The href value then ends up in group 4, while the src value stays in group 2.
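A minimal sketch of the corrected pattern in use, assuming a C# version that supports top-level statements. The sample input and the names `html` and `urls` are hypothetical, added only to show which group holds the value for each branch.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Groups are numbered left to right across the alternation:
//   1 = src quote, 2 = src value, 3 = href quote, 4 = href value.
// The href branch must therefore close its quote with \3, not \1.
string pattern = "<(?:[^>]*?\\s+)?src=([\"'])(.*?)\\1|<(?:[^>]*?\\s+)?href=([\"'])(.*?)\\3";

// Hypothetical sample input
string html = "<img src=\"a.png\"><a href='b.html'>link</a>";

var urls = new List<string>();
foreach (Match m in Regex.Matches(html, pattern))
{
    // Only one branch participates in each match, so check which value group succeeded.
    urls.Add(m.Groups[2].Success ? m.Groups[2].Value : m.Groups[4].Value);
}

Console.WriteLine(string.Join(", ", urls)); // prints "a.png, b.html"
```

With the original \\1 in the href branch, group 1 never participates when that branch is tried, and in .NET a backreference to a group that did not participate fails to match, which is why the combined pattern only found src.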
Upvotes: 1