Amir

Reputation: 59

Finding two patterns in a regular expression

I'm kind of new to regular expressions.

I want to find all the tags that have a src or href attribute in an HTML page. I have found these two patterns; each works on its own, but they do not work when combined:

string pattern = "<(?:[^>]*?\\s+)?src=([\"'])(.*?)\\1|<(?:[^>]*?\\s+)?href=([\"'])(.*?)\\1";

Any ideas?

Thanks.

Upvotes: 2

Views: 42

Answers (1)

Wiktor Stribiżew

Reputation: 626851

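To parse HTML in C#, you should use an HTML parser, like HtmlAgilityPack.

As for "combining" 2 patterns with capturing groups and backreferences, you should always remember that capturing groups are numbered from left to right across the whole pattern, regardless of alternation operators. So, in your pattern, there are 4 capturing groups (with ID = 1, 2, 3, 4), and the quote in front of the href value is captured by group 3, not group 1. In the second alternative, \\1 refers to a group that did not take part in the match, and in .NET such a backreference never matches, which is why the href branch fails. Thus, you need to replace the second \\1 (the one in the href alternative) with \\3. The attribute value is then in group 2 for src and in group 4 for href.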
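The fix described above could look roughly like the sketch below. The class name AttributeFinder and the method name FindUrls are made up for illustration; only the pattern comes from the question, with the second backreference changed to \\3.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original question or answer.
public static class AttributeFinder
{
    // Pattern from the question with one change: the href alternative
    // now refers back to group 3 (its own opening quote) instead of group 1.
    private static readonly Regex SrcOrHref = new Regex(
        "<(?:[^>]*?\\s+)?src=([\"'])(.*?)\\1|<(?:[^>]*?\\s+)?href=([\"'])(.*?)\\3");

    // Returns the src/href values in the order the regex finds them.
    public static List<string> FindUrls(string html)
    {
        var urls = new List<string>();
        foreach (Match m in SrcOrHref.Matches(html))
        {
            // Group 2 holds the value when the src branch matched,
            // group 4 when the href branch matched.
            urls.Add(m.Groups[2].Success ? m.Groups[2].Value : m.Groups[4].Value);
        }
        return urls;
    }
}
```

For example, calling AttributeFinder.FindUrls on the string <a href="page.html">link</a> <img src='pic.png'> should return page.html followed by pic.png. Note that each alternative stops at the first src or href in a tag, so a tag carrying both attributes yields only one value; this is one more reason to prefer a real HTML parser.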

Upvotes: 1
